Abstract

In this paper we look at the problem of scheduling tasks on a single-processor system, where each task requires unit time and 
must be scheduled within a certain time window, and each task can be added to or removed from the system at any time. On 
each operation, the system is allowed to reschedule any tasks, but the goal is to minimize the number of rescheduled tasks. 
Our main result is an allocator that maintains a valid schedule for all tasks in the system if their time windows have constant 
size and reschedules O((1/ε) log(1/ε)) tasks on each insertion as ε → 0, where ε is a certain measure of the schedule flexibility of
the system. We also show that it is optimal for any allocator that works on arbitrary instances. We also briefly mention a few 
variants of the problem, such as when the tasks have time windows of different sizes, for which we have an allocator that we
conjecture reschedules only 1 task on each insertion if the schedule flexibility remains above a certain threshold. 
1 Introduction


Scheduling problems with time restrictions arise everywhere, from doctors’ appointments to distributed systems. Traditionally 
the focus is on finding a good allocation of resources under intrinsic constraints such as availability of resources, deadlines 
and dependencies, and extrinsic requirements such as fairness and latency. Sometimes all we need is a solution that satisfies 
the constraints. At other times we desire an allocation that maximizes some objective function, involving various factors such 
as the resources used, throughput and latency. 

In online scheduling, it is often possible or even necessary to reallocate resources that had been previously reserved for 
preceding requests so as to satisfy a new request. The cost of reallocation should therefore be taken into account as well. 
Furthermore, it would be preferable if the system can be loaded to nearly its full capacity and yet have only a small reallocation 
cost for each new request serviced. 

2 The real-time arbitrary reallocation problem 

We have p identical processors in our system, and each of them starts with an empty schedule. An insert operation is the 
insertion of a new task into the system with a specified length and time window in which it has to be executed. The task has 
to be allocated to one of the processors at some time slot of the required length within the given window such that it does not 
overlap with the time slot of any other task currently in the processor’s schedule. To do so, we may have to reallocate time 
slots for other tasks in the schedule, and possibly even reallocate some tasks to different processors in the system. We must 
allow any insertion that is feasible, even if it requires reallocating all the tasks. A deletion operation is simply the deletion of 
a task currently in the system. Reallocations are allowed on deletions, but none of the allocators described in this paper makes
any reallocations on deletion. The goal is to minimize the number of reallocations on each insertion in the worst case.

It is easy to see that if tasks can be of different lengths or the schedule is allowed to become completely packed, there is
no efficient allocator (i.e. for any n there is a sequence of operations that forces Ω(n) reallocations per insertion on average
although there are at most n tasks in the system at any time). The situation improves when tasks are of unit length and there is
some slack in the schedule, which is hence the focus of this paper. Indeed, the main result of this paper is that for unit-length
tasks and fixed-length windows, given any ε > 0 it is possible to maintain a valid schedule with only O(1) reallocations on
each insertion as long as the instance (i.e. set of tasks) prior to insertion is ε-slack (i.e. there is a valid schedule even if those
tasks are of length (1 + ε) instead).

3 Main results 

We demonstrate a single-processor allocator FA for fixed-length windows that takes O((1/ε) log(1/ε)) reallocations per insertion
for arbitrary ε-slack instances, which is independent of the number of tasks n and even the window length, and does not
reallocate any tasks on deletion. Also, its time complexity is only O((1/ε) log(1/ε) log(n) + log(n)^2) per insertion and O(log(n))
per deletion if it is allowed to maintain an internal state. We also present a hard insert state (i.e. a current allocation and a new
insertion) that gives a matching Lower Bound of Ω((1/ε) ln(1/ε)) reallocations for any allocator that solves it.

This resolves a question posed in Bender et al. (2013) [3]. They investigated the variant with variable-length windows but
where the system is always ε-slack for some large constant ε, and obtained an allocator that uses only O(log*(min(n, c)))
reallocations per insertion where c is the maximum window length. Then they asked if there is an efficient allocator for
arbitrarily small ε, which our paper answers completely: in the positive for the special case of one processor and fixed-length
windows, where our allocator uses only O(1) reallocations per insertion, and in the negative for multiple processors or
variable-length windows.




4 Related work 


Of the wide variety of scheduling problems, completely offline problems from real-world situations tend to be NP-complete 
even after simplifications (Li et al., 2008) [13]. But practical scheduling problems usually involve continuous processes or
continual requests, and hence many other scheduling problems are online in some sense. In addition, scheduling problems 
with intrinsic constraints invariably have to include some means of handling conflicting requests. 

4.1 Dominant resource cost 

A large class of online scheduling problems have focused on minimizing the maximum machine load or the makespan, or on 
maximizing utilization, which are typical targets in static optimization problems. Numerous types of rescheduling have been 
studied and we shall give only a few examples. 

Adaptive rescheduling 

Hoogeveen et al. (2012) [12] developed makespan minimizing heuristics where tasks have deadlines and also a setup time if 
the previous task is of a different type, and insertion of new tasks must not result in any unnecessary additional setups. Castillo 
et al. (2011) [4] proposed a framework similar to this paper’s where each task has a total required duration and a time window 
within which it must be executed, and each request must either be accepted or rejected, immediately and permanently. But 
their goal is to minimize the number of rejected requests while maximizing utilization. 

If however revoking or reallocating earlier requests is permitted, accepting new conflicting requests may become possible. 
For example, Faigle and Nawijn (1995) [10] presented an optimal online algorithm for such a problem, where each request 
demands a certain length of service time starting from the time of the request, and must be immediately assigned to a service
station or rejected, where each service station can service only one request at any one time. A request is considered unfulfilled 
if it is rejected or if its service is interrupted, and the objective is to minimize the number of unfulfilled requests. 

In incremental scheduling, value is accrued over time according to the activities performed and the resources used. Gallagher 
et al. (2006) [11] presented techniques for one variant of incremental scheduling with insufficient resources where each 
activity requires a minimum time period and each pair of activities requires a setup time in-between. Whenever the set of 
possible activities changes, changing the schedule may yield higher value. Their simulations found that a local adjustment 
algorithm results in a rather stable schedule that performs well compared to a greedy global rescheduling. 

Delayed rescheduling 

In one class of problems, restricted reallocations are allowed at the end of the entire request sequence. Tan and Yu (2008) [18] introduced
three such variants for two machines, where tasks of arbitrary lengths are to be completed in any order. On each request, the task
must be immediately assigned to a single machine. After the entire sequence of requests, certain tasks can be reassigned to a 
different machine. In the first variant, the last k tasks assigned can be reassigned. In the second, only the last task assigned to 
each machine can be reassigned. And in the third, any k tasks can be reassigned. They demonstrate algorithms to minimize 
maximum machine load with optimal competitive ratios. Min et al. (2011) [15] proposed a fourth variant where the last task 
of only one machine can be reassigned, and showed that it has the same competitive ratio as the second variant. The fifth 
variant is like the fourth but the total length of all tasks is known beforehand, and they presented an optimal algorithm with 
competitive ratio |. Liu et al. (2009) [14] and Chen et al. (2011,2012) [5, 19] investigated the generalization where the two 
machines are of speeds 1 and s, and k arbitrary tasks can be reassigned at the end, and established optimal algorithms for 
some ranges of values of s. 

Another common method to improve scheduling for a batch of tasks is to use a buffer to store tasks before assigning them. 
The many variations (Dosa and Epstein, 2010; Chen et al., 2013) [8, 6] of the problem are due to the number of machines,
their speeds, and whether some tasks can be executed only on certain machines, among other factors. Sun and Fan (2013) [17] 
analyzed one such load minimization problem for p identical machines, where the system has a fixed-size buffer. On each 
request, the task must be assigned permanently to a machine or stored in the buffer if it is not full. Each task in the buffer can 
be assigned permanently to a machine at any time. They improved the upper bound on the buffer sizes for both the optimally 
competitive algorithm for large p and a 1.5-competitive algorithm. 
Real-time rescheduling

The common drawback to delayed rescheduling models is that they are inapplicable to continuous real-time systems where, 
on each request, immediate resource allocations must be made so that a valid schedule is always maintained. 

Dosa et al. (2011) [9] proved that for a finite set of tasks and two machines with different speeds, using a buffer is not as 
efficient as bounded reallocations, where on each request k previously allocated tasks can be reallocated together with the new 
task. They also give an optimal bounded reallocation algorithm for certain parameter ranges. 

Sanders et al. (2009) [16] considered proportionally bounded reallocation cost instead, such that for each new task of size 
L that is inserted, any set of currently allocated tasks that have total size bounded by rL can be reallocated to a different 
machine. They determined that for any γ > 1 there is such a scheduler that is γ-competitive for some r. Like this paper, this
explores the intermediate class of scheduling algorithms between exact optimal algorithms and γ-competitive algorithms.

4.2 Dominant reallocation cost 

Real-time scheduling problems tend to involve resources that have already been reserved for the scheduling system, and 
thereby it is not unusual that the dominant cost to be minimized is not the resource cost but the reallocation cost. In general, 
the system is also γ-underallocated or ε-slack in some sense, in other words the capacity is always at least γ = 1 + ε times
the load under some suitable measure, and we seek algorithms whose performance degrades gracefully as load approaches 
capacity. 

Davis et al. (2006) [7] put forward a neat problem with a pool of a single resource of total size T, and n users, each requiring 
a certain number of the resource at each time step. At each step, the scheduler has to distribute the pool of resources to the 
users without knowing their requirements but only knowing which of them are satisfied. To do so it must repeatedly change the
resource allocations until all are satisfied, with the aim of minimizing the total number of changes to the user allotments. They
devise a randomized algorithm that is O(log_γ(n))-competitive if the size of the resource pool is increased to γT given any
γ > 1, and also show that an expanded resource pool is necessary for any f(n)-competitive algorithm given any function f.
This illustrates the smooth trade-off between the underallocation and the reallocation cost. 

In a similar direction, Bender et al. (2014) [1] gave an optimal cost-oblivious algorithm to maintain, given ε > 0, an allocation
of memory blocks (with no window constraints) that has makespan within (1 + ε) times the optimal, with a reallocation cost
of O((1/ε) log(1/ε)) times the optimal as long as the reallocation cost is subadditive and monotonic in the block size. And in
[2] they gave a cost-oblivious algorithm to maintain an allocation of tasks to multiple processors that has sum of completion 
times within a constant factor of optimal, with a reallocation cost within 0(1) of the optimal if the reallocation cost is strongly 
subadditive. 

The real-time arbitrary reallocation problem defined at the start of this paper also has nonzero underallocation and dominant
reallocation cost, and there is again a trade-off between minimizing the number of reallocations and minimizing the amount of
underallocation needed. Bender et al. (2013) [3] investigated one variant where the system is always γ-underallocated for
some constant γ, and obtained an algorithm for sufficiently large γ that uses only O(log*(min(n, c))) reallocations where n is
the current number of tasks and c is the maximum window length. It is not apparent whether this is asymptotically optimal,
and they also ask if there is an algorithm for arbitrary γ > 1, which would be more practical. As such, this paper goes in that
direction, and shows that for fixed-length task windows there is indeed a single-processor reallocation scheduler such that the 
number of reallocations needed on each task insertion is dependent on only the current underallocation. We also show partial 
results for variable-length task windows, firstly that it is impossible if underallocation is below some threshold, and secondly 
that a nonzero reallocation rate is inevitable regardless of underallocation. For sufficiently large underallocation, we have an 
allocator that we conjecture makes at most a single reallocation per insertion. 
5 Summary


In this paper an instance is a set of tasks in the system, and is said to be γ-underallocated and ε-slack iff there is still a solution
when task lengths are multiplied by γ = 1 + ε, which we shall call a γ-solution. The slack of an instance is then defined as the
maximum ε such that it is ε-slack, and this maximum can be seen to exist by compactness. We decided to look mainly at the
unit-task case. 

The first variant is where the window lengths are all the same. We explicitly detail a single-processor allocator FA that needs
only O((1/ε) ln(1/ε)) reallocations per insertion for arbitrary ε-slack instances, which is independent of the number of tasks n and even the
window length, and does not reallocate any tasks on deletion. Also, its time complexity is only O((1/ε) log(1/ε) log(n) + log(n)^2)
per insertion and O(log(n)) per deletion if it is allowed to maintain an internal state. To that end we prove a few preliminary
results that are used frequently throughout the proof of this allocator, including the basic theorem Ordering and the procedures
Leftmost and Near. We also construct a hard situation that gives a Lower Bound of Ω((1/ε) ln(1/ε)) reallocations, suggesting
that an allocator might in fact need that number of reallocations for some operations, although we do not know whether there
is an allocator that can perpetually avoid such situations altogether, nor whether there is one that has asymptotically lower
amortized reallocation cost.

The second variant is where window lengths can be different, where Bender et al. asked if there is an efficient allocator 
for arbitrary positive slack [3, Open questions]. We answer the question in the negative under rather weak conditions in 
Section 9.1. Specifically, we say that there is no efficient allocator iff for p processors and for any n there is a sequence of
operations that force 0 (p) reallocations per insertion on average although there are at most n tasks in the system at any time. 

Then for any £ < 3 there is no efficient allocator even if the instance is always £-slack. This raises the question of whether 
there is an efficient allocator if £ > ^. At the other extreme, regardless of the underallocation there is still a Reallocation 
Requirement, where any allocator can be forced to make at least one reallocation for some sequence of insertions. 

If the instance always remains 4γ-underallocated where γ is a power of 2, we can Align it and maintain a solution for the
aligned instance, where windows have endpoints recursively aligned to powers of 2. This Alignment Reduction was mentioned
by Bender et al. [3, Lemma 10], but as stated there it is incorrect* because there is a counter-example for γ = 27. As they
noted in [3, Lemma 4], any insertion such that the instance remains aligned can be solved in O(log(n)) reallocations, and
this is asymptotically tight for some sequence of operations if there is no Underallocation Requirement. Furthermore, all
allocators have Non-Genericity, in the sense that for any γ and as n → ∞ there is some γ-underallocated insert state which
requires Ω(log(n)) reallocations to be solved. Therefore any allocator that does better must completely avoid such insert
states. We believe that it is indeed possible to maintain a solution if the instance is always aligned and 2-underallocated, and 
in fact we have an allocator VA that we conjecture makes at most 1 reallocation per insertion. Empirically it worked in all our 
experiments using partially random operations, but the search space is too large for these tests to be reliable. 

Finally we reduce the p-processor problem to the 1-processor problem in Section 10.1, giving an allocator that is guaranteed 
to work if the original slack is larger than 2 2^. It is doubtful that this allocator is optimal, but it is also not clear how to do 
better. We also show in Section 10.2 that even if all windows are of the same length and the instance always remains £-slack, 
there is no efficient allocator if p > 1 and £ < This is a rather strong negative result in light of our efficient allocator FA 

for one processor, but leaves unanswered what happens when z £ [ 4 ^^! > 2 ^^]. 

At the end we briefly discuss the case of variable-length tasks. Even if the instance is always γ-slack before each insertion,
there is no efficient allocator regardless of how large γ is. And even if the instance is always γ-slack (including after
insertions), there is no efficient allocator if γ < 2, but we do not know what happens when γ > 2.

6 Data structures 

Although the algorithms described guarantee the existence of a solution with low reallocation cost, actually finding the solution
would be inefficient without the following data structures. An IDSet is an iterable ordered set data structure based on
immutable balanced binary trees with all data at the leaves, and it allows access by both id and position, besides the usual
update and search operations. An IDSetRQ is an augmented version of IDSet to also allow range queries for any user-defined
associative binary function. These data structures take worst-case O(log(n)) time for any operation, including deep copying.
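
As a rough illustration of what "range queries for any user-defined associative binary function" means, the following is a minimal sketch using an ordinary array-based segment tree. It is not the IDSetRQ described above: it is mutable, has no access by id, and does not support O(log(n)) deep copying. The class name RangeQueryTree and its methods are hypothetical names chosen for this example.

```python
class RangeQueryTree:
    """Array-based segment tree: point update and range fold in O(log n).

    Illustrative stand-in only; `op` must be associative and `identity`
    must be a two-sided identity for `op`.
    """

    def __init__(self, values, op, identity):
        self.n = len(values)
        self.op = op
        self.identity = identity
        # Leaves live at positions n..2n-1, internal nodes at 1..n-1.
        self.tree = [identity] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = op(self.tree[2 * i], self.tree[2 * i + 1])

    def update(self, pos, value):
        """Replace the element at `pos` and refresh its ancestors."""
        i = pos + self.n
        self.tree[i] = value
        i //= 2
        while i >= 1:
            self.tree[i] = self.op(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def query(self, lo, hi):
        """Fold `op` over positions lo..hi-1 (half-open), left to right."""
        left, right = self.identity, self.identity
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:
                left = self.op(left, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                right = self.op(self.tree[hi], right)
            lo //= 2
            hi //= 2
        return self.op(left, right)
```

Keeping separate left and right accumulators preserves the element order, so the sketch also works for associative functions that are not commutative, such as sequence concatenation.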


* Their main result is still valid because it depends only on the correct reduction theorem.
7 Definitions


We shall begin by defining general symbolic notation that is not universal but will be used in this paper to express all
statements precisely yet concisely, and then we shall define terms specific to the reallocation problem. 

7.1 General symbolic notation 

Let null be a sentinel value denoting “nothing”. 

Let x[i] denote the element in x indexed by i for any sequence x and integer i.

Let x[y] = (x[i] : i ∈ y) for any sequence x and integer sequence y, where “∈” denotes “is generated in order by”.

Let [a..b] be the strictly increasing sequence (x : x ∈ ℤ ∧ a ≤ x ≤ b) for any reals a, b, and let x[a..b] = (x[i] : i ∈ [a..b]).
Let ∅ denote the empty sequence, and for any x let seq(x) be the sequence that has x as its only element.

Let x • y be the concatenation of x followed by y for any finite sequences x, y.

Let x.first and x.last be the first and last element in x respectively for any sequence x, and let #(x) be its length. 

Let X.start and X.end be the start and end respectively of X for any interval X, and let span(X) be its length/span. 

Let ≤ on intervals be the partial ordering such that X ≤ Y ⇔ X.start ≤ Y.start ∧ X.end ≤ Y.end for any intervals X, Y.

For convenience let left/right be associated with earlier/later for interval comparison. 

Call an integer/interval sequence x ordered iff its non-null elements are in increasing order, and for any integer sequence y let
“sort x[y]” mean “permute the non-null elements among x[y] such that x[y] is ordered”.

Let c + [a, b] = [a + c, b + c] and [a, b]c = [ac, bc] for any reals a, b, c.

Let span(S) = max_{X∈S} X.end − min_{X∈S} X.start for any finite set/sequence of intervals S.

Let ( P ? X : Y ) evaluate to X iff P = true and Y otherwise for any boolean P and expressions X, Y.

7.2 Reallocation problem terminology 

Take any instance I = (n, T, W) that comprises a set of n unit tasks T[1..n] and their windows W[1..n]. For convenience we
shall often not mention the tasks but associate an allocated slot directly with the task’s window.

Call S a valid allocation for I iff it is an allocation of tasks in I such that each task T[i] in I is allocated to the slot S[i]
of unit length within W[i] or S[i] = null if T[i] is unallocated, and no two tasks in I are allocated to overlapping slots. For
convenience we shall use “slot” to refer to a unit interval unless otherwise specified.

Call I ordered iff W[1..n] is ordered. If so, call a valid allocation S for I ordered iff S[1..n] is ordered. (Unallocated tasks
are ignored.)

Call S a solution for I iff S is a valid allocation for I that allocates all tasks in I.

Call S a partial solution for (I, k) iff S is a valid allocation for I that allocates all tasks in I except T[k].

Call S a γ-solution for I iff S is a solution for I′ where I′ is I with all task lengths multiplied by γ.

Call S a γ-partial solution for (I, k) iff S is a partial solution for (I′, k) where I′ is I with all task lengths multiplied by γ.

Call I feasible iff there is a solution for I, and call I γ-underallocated iff there is a γ-solution for I.

Call I ε-slack iff I is (1 + ε)-underallocated, and let the underallocation of I be the maximum such ε, which clearly exists.

Call (I, S, k) an insert state iff S is a partial solution for (I, k), and call it ordered iff I is ordered, feasible iff I is feasible,
and (1 + ε)-underallocated or equivalently ε-slack iff there is a (1 + ε)-partial solution for (I, k).

Call an insertion of a new task feasible iff the resulting instance is feasible, or equivalently iff it creates a feasible insert state. 
Call an algorithm A an allocator iff it maintains a solution on any feasible task insertion. 
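
The definitions of valid allocation, solution, partial solution and γ-solution can be made concrete with a small checker. The following is a minimal single-processor sketch under stated assumptions: intervals are (start, end) tuples, null is represented by None, indices are 0-based rather than 1-based, and the function names and the floating-point tolerance tol are hypothetical choices made for this example.

```python
def is_valid_allocation(windows, slots, length=1.0, tol=1e-9):
    """Check the 'valid allocation' conditions for tasks of the given length."""
    placed = []
    for w, s in zip(windows, slots):
        if s is None:                      # null: the task is unallocated
            continue
        if abs((s[1] - s[0]) - length) > tol:
            return False                   # slot has the wrong length
        if s[0] < w[0] - tol or s[1] > w[1] + tol:
            return False                   # slot is not within its window
        placed.append(s)
    placed.sort()
    # No two allocated slots may overlap (sharing an endpoint is allowed).
    return all(a[1] <= b[0] + tol for a, b in zip(placed, placed[1:]))


def is_solution(windows, slots, length=1.0):
    """A solution is a valid allocation that allocates every task."""
    return (all(s is not None for s in slots)
            and is_valid_allocation(windows, slots, length))


def is_partial_solution(windows, slots, k, length=1.0):
    """A partial solution allocates every task except the one at index k."""
    others = all(s is not None for i, s in enumerate(slots) if i != k)
    return (slots[k] is None and others
            and is_valid_allocation(windows, slots, length))
```

Under these conventions, a γ-solution is checked by calling is_solution with length=γ, since a γ-solution is a solution for the instance with all task lengths multiplied by γ.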
8 Fixed window length


In this cost model that does not distinguish between tasks of different lengths, an efficient allocator is impossible if tasks can 
have arbitrary lengths, and hence we shall look at only the case of unit-length tasks until the very end of this paper. This 
section will be restricted to the first variant where all the windows have the same fixed length c, and henceforth we can always 
assume that the instance is ordered. Our main result is that for one processor the maximum number of reallocations needed on
each insertion for any ε-slack instance is Θ((1/ε) log(1/ε)) as ε → 0. In the following subsections, we will establish a collection
of useful algorithms and theorems, and then demonstrate a family of instances that show that the bound cannot be improved, 
and finally describe and prove an optimal allocator that realizes the bound while still having a good time complexity. 

8.1 Preliminaries 

A simple but very helpful theorem is that given any ordered instance I and solution S for I, sorting S (permuting the slots
so that they are now in the same order as their windows) gives an ordered solution for I. As an easy consequence, given any 
valid allocation S for I, sorting any subsequence of S in-place (so that the slots in that subsequence are now in the same order 
as their windows) gives a valid allocation for I. Also, any feasible instance has an ordered solution. 

It is then clear that the greedy algorithm, which allocates the tasks from left to right each to the leftmost possible slot, produces
an ordered solution L, such that given any ordered solution X, every slot in L is no later than the corresponding slot in X.
Now given any feasible ordered insert state (I, S, r) with n tasks such that S is ordered, we can use L to construct a near
ordered solution for (I, S, r), which is defined as an ordered solution N such that S[r − 1] ≤ N[r] ≤ S[r + 1] (the inserted task
is allocated to a slot within the range given by the neighbouring slots in the original ordered solution).

Along the way, we also define a procedure Snap that takes as inputs a slot s and a window w and returns the slot within w that 
is closest to s. This procedure will be used later as well. 

The rest of this preliminary results section contains the precise statements and proofs of the above theorems. All of them can
be extended without difficulty to the multi-processor case, unlike the results in the later sections. 

Theorem 1 (Ordering). The following are true:

1. For any ordered instance I = (n, T, W) and solution S for I, S when sorted is an ordered solution for I.

2. For any ordered instance I = (n, T, W) and valid allocation S for I and ordered sequence k[1..m], S with S[k[1..m]]
sorted is a valid allocation for I.

Proof. We shall first prove (1). Take any ordered instance I = (n, T, W) and solution S for I. Let S′ be S sorted.

While S ≠ S′ we shall iteratively modify S such that the following invariances hold after each step j:

1. S[1..j] = S′[1..j].

2. S is a solution for I.

Then after at most n steps S[1..n] = S′[1..n] and hence S′ is a solution for I.

After step 0, Invariances 1,2 are trivially satisfied. At step j, let W[i] be the earliest window in W such that S[i] ≠ S′[i]. Then
S[i] > S′[i] = S[k] for some k ∈ [i + 1..n] because S[1..i − 1] = S′[1..i − 1] and S′ is ordered. Also, i > j − 1 because of
Invariance 1. Swap S[i] and S[k]. S is still a solution for I, because before the swap W[i] ≤ W[k] and S[i] > S[k]. After the
swap, S[i] = S′[i] and hence S[1..i] = S′[1..i]. Therefore Invariances 1,2 are preserved.

Now we shall prove (2). Take any ordered instance I = (n, T, W) and valid allocation S for I and ordered sequence k[1..m].
Let X be S with S[k[1..m]] sorted, and let j[1..a] be the subsequence of k such that T[j[1..a]] are exactly the allocated tasks in
S[k[1..m]]. Then S[j[1..a]] is a solution for I′ = (a, T[j[1..a]], W[j[1..a]]) and X[j[1..a]] is S[j[1..a]] sorted. Thus X[j[1..a]]
is an ordered solution for I′ by (1), and hence X[k[1..m]] is an ordered valid allocation for (m, T[k[1..m]], W[k[1..m]]). Since
X does not have any overlapping slots, X is a valid allocation for I.

Remark. Ordering and its proof apply with no change to the p-processor case, because sorting does not affect the processor
and position for each slot. Additionally, it is easy to see that any ordered solution can be made into a cyclic one, namely that 
it has exactly the same allocated slots but the allocated processor cycles with the window rank modulo p. 
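
The sorting operation in Ordering can be sketched in a few lines. This is a minimal illustration under stated assumptions: slots are (start, end) tuples of unit length, null is None, indices are 0-based, and the names sort_slots and fits are hypothetical. The windows are assumed to be ordered and the index sequence increasing, as the theorem requires.

```python
def sort_slots(slots, indices=None):
    """Permute the non-null slots among `indices` into increasing order.

    `indices` is an increasing sequence of positions (default: all positions).
    Unallocated positions (None) are left where they are.
    """
    if indices is None:
        indices = range(len(slots))
    allocated = [i for i in indices if slots[i] is not None]
    ordered = sorted(slots[i] for i in allocated)
    result = list(slots)
    for i, s in zip(allocated, ordered):
        result[i] = s
    return result


def fits(windows, slots):
    """True iff every allocated slot lies within its task's window."""
    return all(s is None or (w[0] <= s[0] and s[1] <= w[1])
               for w, s in zip(windows, slots))
```

For example, with ordered windows [(0, 4), (1, 5), (2, 6)] and the solution [(3, 4), (1, 2), (2, 3)], sorting gives [(1, 2), (2, 3), (3, 4)], which still places every slot inside its window, as the theorem predicts.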
Procedure 2 (Leftmost).


Implementation 

Procedure Leftmost( instance I = (n, T, W) ):
    Set S[0].end = −∞.
    For i from 1 up to n:
        Set S[i].start = max(W[i].start, S[i − 1].end).
        Set S[i].end = S[i].start + 1.
    Return S[1..n].
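
The procedure translates almost directly into code. The following is a minimal sketch assuming an ordered list of (start, end) windows with 0-based indices; the function names are hypothetical. The helper is_feasible is an addition for illustration: for an ordered instance, the greedy result fits in the windows exactly when the instance is feasible, because Leftmost returns a solution on every feasible ordered instance and any allocation that fits is itself a solution.

```python
import math


def leftmost(windows):
    """Greedy Leftmost allocation of unit tasks for ordered windows."""
    slots = []
    prev_end = -math.inf                  # plays the role of S[0].end
    for w_start, _w_end in windows:
        # Leftmost start that stays in the window and clears the previous slot.
        start = max(w_start, prev_end)
        slots.append((start, start + 1))
        prev_end = start + 1
    return slots


def is_feasible(windows):
    """Ordered instance is feasible iff every greedy slot ends within its window."""
    return all(s[1] <= w[1] for s, w in zip(leftmost(windows), windows))
```

Note that leftmost itself does not check window ends, matching the procedure, which is only applied to feasible instances.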

Theorem 3 (Leftmost’s properties). Take any feasible ordered instance I. Let S = Leftmost(I). Then the following hold:

1. S is an ordered solution for I.

2. For any ordered solution X for I, we have S[i] ≤ X[i] for any i ∈ [1..n].

Proof. Consider any ordered solution X for I, and set S[0] = X[0] = (−∞, −∞). We shall inductively prove that S[i] ≤ X[i]
for each i ∈ [1..n]. For each i from 1 to n, one of the following cases holds:

◇ W[i].start ≥ S[i − 1].end:
Then S[i].start = W[i].start ≤ X[i].start, because X is a solution.

◇ W[i].start < S[i − 1].end:
Then S[i].start = S[i − 1].end ≤ X[i − 1].end ≤ X[i].start, because S[i − 1] ≤ X[i − 1] by induction and X is an
ordered solution.

In both cases S[i] ≤ X[i]. Therefore by induction (2) follows. Also, for any i ∈ [1..n], we have both S[i].start ≥ W[i].start and
S[i].end ≤ X[i].end ≤ W[i].end, and hence S[i] ⊆ W[i]. Finally by construction S[i].start ≥ S[i − 1].end for any i ∈ [1..n],
hence S is an ordered solution. Since such an X exists by Ordering (Theorem 1), (1) follows.

Remark. Leftmost has an analogous version for multiple processors, which assigns for each task in order the leftmost possible 
(processor,slot) pair that does not overlap previous assignments, and has exactly the same properties, with an analogously 
modified proof. 

Procedure 4 (Near). 

Dependencies 

Leftmost (Procedure 2) 

Implementation 

Procedure Near( ordered insert state (I = (n, T, W), S, r) with ordered S ):
    Set L = Leftmost(I).
    Return ( L[r] ≥ S[r − 1] ? L : L[1..r − 1] • seq(S[r − 1]) • S[r + 1..n] ).
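
A minimal sketch of Near under stated assumptions follows: windows are ordered (start, end) tuples, the partial solution is a list with None at the inserted position, indices are 0-based, and unit slots are compared by their start positions. The treatment of r = 0, where there is no left neighbour, as a slot at −∞ is an assumption made for this example and is not specified in the procedure.

```python
import math


def leftmost(windows):
    """Greedy Leftmost allocation of unit tasks for ordered windows."""
    slots, prev_end = [], -math.inf
    for w_start, _w_end in windows:
        start = max(w_start, prev_end)
        slots.append((start, start + 1))
        prev_end = start + 1
    return slots


def near(windows, slots, r):
    """Return a near ordered solution for the insert state (windows, slots, r).

    `slots` is an ordered partial solution with slots[r] None (0-based r).
    """
    L = leftmost(windows)
    # Assumed sentinel: with no left neighbour, L[r] is trivially late enough.
    prev = slots[r - 1] if r > 0 else (-math.inf, -math.inf)
    if L[r][0] >= prev[0]:
        return L
    # Otherwise reuse the left neighbour's old slot for the inserted task,
    # keep the greedy slots before it and the old slots after it.
    return L[:r] + [prev] + slots[r + 1:]
```

For windows [(0, 4), (1, 5), (2, 6)] and the partial solution [(3, 4), None, (5, 6)], the greedy slot (1, 2) for the inserted task is earlier than its left neighbour (3, 4), so the second branch applies and the result is [(0, 1), (3, 4), (5, 6)].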

Definition 5 (Near ordered solution). Take any feasible ordered insert state (I = (n, T, W), S, r) with ordered S. Call N a
near ordered solution for (I, S, r) iff all the following hold:

◇ N is an ordered solution for I.

◇ S[r − 1] ≤ N[r] ≤ S[r + 1].

Theorem 6 (Near’s properties). Take any feasible ordered insert state (I = (n, T, W), S, r) with ordered S. Let N =
Near(I, S, r). Then N is a near ordered solution for (I, S, r).

Proof. Let L = Leftmost(I). One of the following cases holds:

◇ L[r] ≥ S[r − 1]:
By Leftmost’s properties (Theorem 3), L is an ordered solution for I and L[r].start = max(W[r].start, L[r − 1].end)
≤ max(W[r + 1].start, S[r − 1].end) ≤ S[r + 1].start, and hence S[r − 1] ≤ L[r] ≤ S[r + 1]. Also, N = L.

◇ L[r] < S[r − 1]:
Then L[r − 1].end ≤ L[r].start < S[r − 1].start, and hence N = L[1..r − 1] • seq(S[r − 1]) • S[r + 1..n] is an ordered
solution for I and S[r − 1] = N[r] ≤ S[r + 1].

Therefore in both cases N has the properties claimed.

Remark. The multi-processor version of Near combines a Leftmost solution and a Rightmost solution (defined symmetrically 
to Leftmost) that agree on the processor for the inserted task to obtain a solution with the desired properties for similar reasons. 
Procedure 7 (Snap).

Implementation 

Procedure Snap( slot s , window w ):
    If s.start < w.start:
        Return w.start + [0, 1].
    If s.end > w.end:
        Return w.end + [−1, 0].
    Return s.

Theorem 8 (Snap’s properties). Take any slot s and window w. Let r = Snap(s, w). Then the following properties hold: 

1. r ⊆ w.

2. r.start ≤ max(w.start, s.start).

3. r.end ≥ min(w.end, s.end).

Proof. All properties are obvious by construction. 
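
For readers who prefer to see the three properties checked on concrete values, here is a minimal sketch of Snap, assuming (start, end) tuples, a unit-length slot, and a window of length at least 1. The function names are hypothetical.

```python
def snap(s, w):
    """Return the unit slot within window w that is closest to slot s."""
    if s[0] < w[0]:
        return (w[0], w[0] + 1)      # s starts too early: use the leftmost slot
    if s[1] > w[1]:
        return (w[1] - 1, w[1])      # s ends too late: use the rightmost slot
    return s                         # s already lies within w


def snap_properties_hold(s, w):
    """Check the three properties of Snap for one slot/window pair."""
    r = snap(s, w)
    inside = w[0] <= r[0] and r[1] <= w[1]
    start_ok = r[0] <= max(w[0], s[0])
    end_ok = r[1] >= min(w[1], s[1])
    return inside and start_ok and end_ok
```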

8.2 Lower bound 

RA is asymptotically optimal for an allocator that keeps the solution in order, but we can do much better if the solution does
not have to be kept in order. But before we describe such an allocator, we will first present for any ε ∈ (0,1) an ε-slack feasible
insert state on which any allocator will take at least ⌊log_{1+ε}(1/ε)⌋ reallocations, which is Θ((1/ε) log(1/ε)) as ε → 0. There are
worse situations that need approximately double that, but the insert state given here is much easier to analyze, giving a lower
bound on any allocator that can work on arbitrary feasible insert states. In this insert state there are windows in order
and the inserted window is the earliest. Each window has the smallest non-negative start position possible, as constrained by
the existence of a (1 + ε)-partial solution. The partial solution is also ordered with the k-th allocated slot at [k − 1, k]. The
details are given below.

Theorem 9 (Lower Bound). Take any ε ∈ (0,1) and any c > 1/ε + 1. Then there is some feasible ε-slack insert state with
window length c such that any allocator A that solves it makes at least ⌊log_{1+ε}(1/ε)⌋ reallocations.

Proof. (In this proof we shall omit the derivation of purely algebraic inequalities involving ε,c,n as they can be easily verified.)
First let I = (n,T,W) where n = ⌊c/(ε(1 + ε))⌋ and W[i] = max(c, i(1 + ε)) + [−c,0] for each i ∈ [1..n]. Then I is an ε-slack instance,
because it has a (1 + ε)-solution E where E[i] = [i − 1, i](1 + ε) for each i ∈ [1..n]. This is easy to check as follows. Firstly,
E[1..n] are non-overlapping. Secondly, for any i ∈ [1..n], E[i] ⊆ W[i] because:

◇ E[i].start = (i − 1)(1 + ε) ≥ max(0, i(1 + ε) − c) = W[i].start.

◇ E[i].end = i(1 + ε) ≤ W[i].end.

Now let S[i] = [i − 1, i] for each i ∈ [1..n]. Then S is a solution for I, since S[1..n] are non-overlapping, and for any i ∈ [1..n],
S[i] ⊆ W[i] because:

◇ S[i].start = i − 1 ≥ max(0, i(1 + ε) − c) = W[i].start.

◇ S[i].end = i ≤ i(1 + ε) ≤ W[i].end.

After an insertion into I of a new task t with window [0,c], the resulting insert state is feasible, since t can be allocated to
[0,1] ⊆ [0,c], and for each i ∈ [1..n], T[i] can be allocated to [i, i + 1] because:

◇ i ≥ S[i].start ≥ W[i].start.

◇ i + 1 ≤ max(c, i(1 + ε)) = W[i].end.

Now consider any allocator A that solves such an insert state. Let S^A[0] be the slot that A will allocate t to, and S^A[1..n] be the
slots that A will allocate T[1..n] to respectively. Let l = ⌊log_{1+ε}(1/ε)⌋. Set k[0] = 0. We shall construct k[1..l] iteratively such
that the following invariances hold after each step j from 0 to l:

1. A will reallocate T[k[1..j]].

2. S^A[k[0..j]].end is strictly increasing.

3. S^A[k[j]].end ≤ c(1 + ε)^j.

After step 0, Invariances 1,2 are trivially satisfied, and Invariance 3 is satisfied because S^A[k[0]].end = S^A[0].end ≤ c ≤ c(1 + ε)^0.
To construct k[1..l], at step j from 1 to l construct k[j] as follows. Set m = ⌊S^A[k[j − 1]].end⌋. Then, by Invariance 3,
S^A[k[j − 1]].end ≤ c(1 + ε)^(j−1) ≤ c(1 + ε)^(l−1) ≤ c(1 + ε)^(log_{1+ε}(1/ε) − 1) = c/(ε(1 + ε)) and hence m ≤ ⌊c/(ε(1 + ε))⌋ = n. Thus
{ S^A[i] : i ∈ [0..m] } are slots that will be allocated by A, and hence S^A[i].end ≥ m + 1 for some i ∈ [1..m] outside k[0..j − 1], because:

◇ max_{i∈[0..m]} S^A[i].end ≥ m + 1, since S^A[0..m] do not overlap and start no earlier than 0.

◇ S^A[k[i]].end ≤ S^A[k[j − 1]].end < m + 1 for each i ∈ [0..j − 1], since S^A[k[0..j − 1]].end is strictly increasing by Invariance 2.

Now set k[j] ∈ [1..m] such that S^A[k[j]].end ≥ m + 1. We then verify that Invariances 1,2,3 are preserved. Firstly, A will
reallocate T[k[j]], because S^A[k[j]].end > m ≥ k[j] = S[k[j]].end. Secondly, S^A[k[j]].end ≥ m + 1 = ⌊S^A[k[j − 1]].end⌋ + 1 >
S^A[k[j − 1]].end. Thirdly, S^A[k[j]].end ≤ W[k[j]].end = max(c, k[j](1 + ε)) ≤ c(1 + ε)^j, because k[j] ≤ m ≤ S^A[k[j − 1]].end ≤
c(1 + ε)^(j−1) by Invariance 3.

Therefore A will reallocate T[k[1..l]] by Invariance 1, which are distinct because S^A[k[1..l]] are distinct by Invariance 2, and
hence A will reallocate at least l = ⌊log_{1+ε}(1/ε)⌋ tasks.
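
The algebraic checks omitted in the proof can be replayed numerically. The Python sketch below builds the windows W[i] = max(c, i(1 + ε)) + [−c, 0] with exact rational arithmetic and tests the three containment claims (the stretched solution E, the packed solution S, and the allocation shifted by one after the insertion). Two points are our reading of the proof rather than a quotation: the instance size n = ⌊c/(ε(1 + ε))⌋ and the bound ⌊log_{1+ε}(1/ε)⌋. All function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def lower_bound_windows(eps: Fraction, c: Fraction):
    """Windows W[1..n] of the lower-bound instance, keyed by rank."""
    assert 0 < eps < 1 and c > 1 / eps + 1
    n = floor(c / (eps * (1 + eps)))  # assumed reading of the lost value of n
    windows = {}
    for i in range(1, n + 1):
        end = max(c, i * (1 + eps))
        windows[i] = (end - c, end)
    return n, windows


def inside(slot, window) -> bool:
    return window[0] <= slot[0] and slot[1] <= window[1]


def check_instance(eps: Fraction, c: Fraction):
    n, w = lower_bound_windows(eps, c)
    ranks = range(1, n + 1)
    stretched = all(inside(((i - 1) * (1 + eps), i * (1 + eps)), w[i]) for i in ranks)
    packed = all(inside((i - 1, i), w[i]) for i in ranks)
    shifted = all(inside((i, i + 1), w[i]) for i in ranks)
    return n, stretched and packed and shifted


def forced_reallocations(eps: Fraction) -> int:
    """Largest l with (1 + eps)^l <= 1/eps, i.e. floor(log_{1+eps}(1/eps))."""
    count, power = 0, 1 + eps
    while power <= 1 / eps:
        count += 1
        power *= 1 + eps
    return count
```

For ε = 1/4 and c = 6 this gives n = 19 with all three checks passing, and a bound of 6 forced reallocations.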

Remark. Lower Bound (Theorem 9) is tight for allocators that are required to work on any insert state, as we shall show in 
the subsequent section, but if an allocator is used on the system from the beginning, we do not know if it is possible to do 
better by avoiding such ‘bad’ insert states. 

8.3 Optimal single-processor fixed-window allocator 

We will now describe a single-processor allocator FA that can solve any ε-slack feasible insert state using at most
max(2 log_{1+…}(…) + … + 6, 14) reallocations, which is O((1/ε) log(1/ε)) as ε → 0, and hence FA is asymptotically optimal
for general insert states. If FA is allowed to maintain an internal state, it would take only O((1/ε) log(1/ε) log(n) + log(n)²) time
on each feasible insertion, and will return failure in O(log(n)) time on an infeasible insertion.

We first give an outline of the main ideas behind FA’s insertion procedure: 

1. If the surrounding region (around the window of the task to be allocated) has enough empty space, we pack the slots in 
that region to create a gap into which we can squeeze one more slot. 

2. If the surrounding region has not enough empty space, we need to first ‘get’ to some region with sufficient empty space 
using two mechanisms: 

(a) Jumping: A jump allocates a task to a slot that had already been allocated to another task whose window is as far as 
possible in some direction, displacing the other task, which then has to be reallocated, potentially in the next jump. 

(b) Pushing: When jumps are not possible, we have to use pushes first, where pushing a slot in some direction is simply 
to shift it just barely enough to make space for the inserted task or the previous pushed slot. The pushed slot may 
overlap the slot of another task, which may then be reallocated in a subsequent push or jump. 

3. In both the pushing and packing phases, in order to guarantee that they do not reallocate tasks outside their windows, it 
is very important to have the slots involved be in the same order as their windows, which we achieve by swaps. 

4. We do not actually do the jumping, but just simulate it to see where we can ‘get’ to. Only after we have finished the final
packing phase do we perform actual jumps from the inserted task to the gap created.

5. If the inserted task is allocated correctly, at most O(1/ε) slots need to be pushed before jumping can be carried out. So in
some cases it is necessary to make a second attempt to find such a slot for the inserted task if the first attempt takes too
many reallocations.

6. We allow the region for the packing phase to contain up to 2/ε slots each, for two reasons:

(a) It means that we will pack at most O(1/ε) slots.

(b) When packing is impossible, the span of all the windows reached so far by the simulated jumping will be less than
about (1 + ½ε) times the number of slots within it. Thus the existence of a (1 + ε)-solution makes the span grow by
a factor of about (1 + ½ε) on each jump, which is the fastest possible in the worst case to within a constant factor.

These ideas may sound simple, but it is extremely tricky to actually make them work. So now we shall give a step-by-step 
high-level description of FA’s insertion procedure. 
FA keeps the windows W in order in an IDSetRQ, so that it can query for any range of consecutive windows their maximum
(start point − rank within the range) and their minimum (end point − rank within the range). For any inserted task with
window w, FA can easily determine the rank r of w if it is inserted into W, and can then obtain as follows, in O(log(n))
time, the interval [istart, iend] such that for any unit interval s, there is an ordered solution that allocates the new task to s if
and only if s is contained within [istart, iend]:

◇ istart = (max(start point − rank) over windows of rank ≤ r) + r
◇ iend = (min(end point − rank) over windows of rank ≥ r) + r

This means that FA can easily check whether the insertion is feasible, because by the Ordering theorem any instance is feasible
if and only if it has an ordered solution, which is equivalent to iend − istart ≥ 1.
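
The two formulas can be sketched directly in Python. This version scans the windows linearly instead of using the range queries of an IDSetRQ, so it takes O(n) time rather than O(log(n)); it assumes the windows are (start, end) pairs already in order with the inserted window included at 1-based rank r, and that the maxima and minima range over ranks up to and including r, as in Lemma 11.1. The function names are hypothetical.

```python
from typing import List, Tuple

Window = Tuple[float, float]  # (start, end)


def insertion_range(windows: List[Window], r: int) -> Tuple[float, float]:
    """Interval of positions the inserted unit slot can take in an ordered solution."""
    n = len(windows)
    istart = max(windows[i - 1][0] - i + r for i in range(1, r + 1))
    iend = min(windows[i - 1][1] - i + r for i in range(r, n + 1))
    return istart, iend


def insertion_feasible(windows: List[Window], r: int) -> bool:
    """The insertion is feasible iff the range can hold a unit slot."""
    istart, iend = insertion_range(windows, r)
    return iend - istart >= 1
```

For windows [(0,2),(0,2),(1,3)] and r = 2 the range is [1, 2], so the insertion is feasible; for three copies of the window (0,2) the range collapses to a single point and the insertion is rejected.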

FA also keeps (slot, window) pairs in an IDSetRQ in order of the slots, so that it can query for any range of consecutive slots 
their tasks’ earliest window and latest window. This is a key ingredient for an efficient implementation of the jumping part of 
the algorithm, where each jump allocates a task to a slot that had already been allocated to another task whose window is as 
far as possible in some direction, displacing the other task, which then has to be reallocated. A sequence of jumps will cause 
a cascade of reallocations that propagate as fast as possible, so that space usage can be efficiently reorganized. 

If FA knew what ε was, it could just run the appropriate subroutine:

◇ c > 2/ε + 4: LargeWindow
◇ c ≤ 2/ε + 4: SmallWindow

LargeWindow is named thus because the window length is large enough to ensure that jumping can begin immediately. It
simulates jumps from the inserted task’s window to the furthest windows in both directions simultaneously on each jump. It
stops when it finds 2/ε or fewer consecutive allocated slots in an interval within the current span with a total empty space of at
least 1, where the current span is defined as the span of all windows reached by the jumps. Then it sorts those slots and packs
them greedily to create a unit gap in the middle. Finally it performs actual jumps from the inserted window to the unit gap,
which solves the instance.

To find such a set of slots, FA divides the intervening gaps (touching slots are considered to have a gap of length 0 in-between)
into blocks of (2/ε + 1) consecutive gaps each, except the last block, and uses the IDSetRQ containing the (slot, window) pairs
to determine in O(log(n)) time if the blocks have average total space at least 1. If so, FA uses a binary search to find a block
with at least average total space in O(log(n)²) time using the same IDSetRQ. Such a block will have total space at least 1.
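
The block decomposition can be illustrated with a simplified Python sketch. It replaces the IDSetRQ and the binary search by a plain linear scan, so it shows only what is being searched for and not the stated running time. It assumes unit slots given by their start points, all lying inside the span, and a block size of m + 1 gaps; the name `find_roomy_block` is hypothetical.

```python
from typing import List, Optional, Tuple


def find_roomy_block(
    span: Tuple[float, float], slot_starts: List[float], m: int
) -> Optional[Tuple[int, List[float]]]:
    """Return (index of first gap, gaps) of the first block with total space >= 1."""
    a, b = span
    starts = sorted(slot_starts)
    # Boundaries alternate: span start, slot start, slot end, ..., span end.
    edges = [a] + [x for s in starts for x in (s, s + 1)] + [b]
    gaps = [edges[2 * k + 1] - edges[2 * k] for k in range(len(starts) + 1)]
    # Blocks of m + 1 consecutive gaps enclose at most m slots each.
    for first in range(0, len(gaps), m + 1):
        block = gaps[first : first + m + 1]
        if sum(block) >= 1:
            return first, block
    return None
```

For the span (0, 10), slots starting at 0, 1, 2.5, 3.5, 5, 6.5 and m = 2, the gaps are 0, 0, 0.5, 0, 0.5, 0.5, 2.5, and the first block with total space at least 1 consists of the gaps at positions 3 to 5.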

SmallWindow on the other hand may need to use a number of pushes before jumping is even possible, where a push reallocates
a task by shifting its slot just enough to fix an overlap. SmallWindow may need to make two attempts. Letting X be the partial
solution sorted, it separately tries two positions for the inserted slot s:

◇ Within [istart, iend] and nearest to X[r − 1] + 1
◇ Within [istart, iend] and nearest to X[r + 1] − 1

One of them is guaranteed to succeed due to the restrictions that any (1 + ε)-solution for the previous instance places on the
subsequent steps, but it is not clear how it can be determined efficiently without even knowing ε. Thus SmallWindow simply
tries both. In each case, SmallWindow sorts the allocated slots that are within the inserted window, and now s ‘separates’ the
allocated slots, namely that there is no slot that is after s but with earlier window, or before s but with later window, because
X[r − 1] ≤ s ≤ X[r + 1]. SmallWindow then pushes up to m neighbouring slots aside on each side as necessary, for each of them
after swapping it into the window with the same rank, so that each of the pushed slots also ‘separates’ the allocated slots.

On each side separately, if the last pushed slot still overlaps the next one, the next one is deallocated and jump simulation is
begun in the pushing direction with its window as the initial window. For the right side, the current span is defined to start at
the end of the last pushed slot, and for the left side defined to end at the start of the last pushed slot. Everything else follows
LargeWindow exactly.

Both LargeWindow and SmallWindow make only O((1/ε) log(1/ε)) jumps due to the following reasons. Firstly, on each jump
the current span grows by a factor of at least about 1 + § when it is sufficiently large, because the average space per block 
of (2/ε + 1) gaps within it must be less than 1, making the slots within the current span cramped. Secondly, the existence of
a (1 + ε)-solution for the previous instance forces the current span to grow roughly in proportion to the number of slots within it.
For LargeWindow, the starting span is already large enough. For SmallWindow, it is enough that the starting span just be at
least 2 + ε, which is ensured by the pushing phase. In either procedure, the current span cannot increase by more than c − 1
on each jump, and hence the number of jumps ends up being O^log^^i^('i)^ for LargeWindow and 0^1og^^i^(|)^ for 
SmallWindow. 
The problem is that it is probably impossible to determine ε exactly in O((1/ε) log(1/ε) log(n)) time, so FA uses the following
standard trick to avoid having to know ε at all. On a feasible insertion, FA starts by setting ε̂ = 2, and assumes that ε =
ε̂ in order to run the appropriate procedure as before, but limiting the number of jumps to the maximum it should be. If it
fails, it must be that ε < ε̂, so FA halves ε̂ and tries again, repeating until 2/ε̂ ≥ n, at which point both LargeWindow and
SmallWindow would definitely succeed. Each failed trial takes O((1/ε̂) log(1/ε̂) log(n)) time and hence all the failed trials take
O((1/ε) log(1/ε) log(n)) time by a simple summation. The one successful trial takes O((1/ε) log(1/ε) log(n) + log(n)²) time.
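
The control flow of this trick is easy to sketch in Python. The callable `attempt` below is a hypothetical stand-in for running LargeWindow or SmallWindow with parameter m (so that the guessed slack is 2/m); the sketch assumes only that `attempt(m)` succeeds whenever m ≥ n, and it records the values tried so that the geometric growth is visible.

```python
from typing import Callable, List, Tuple


def insert_by_doubling(n: int, attempt: Callable[[int], bool]) -> Tuple[int, List[int]]:
    """Double m from 1 up to 2n until the hypothetical attempt(m) succeeds."""
    tried: List[int] = []
    m = 1
    while m <= 2 * n:
        tried.append(m)  # the guessed slack for this trial is 2 / m
        if attempt(m):
            return m, tried
        m *= 2
    raise RuntimeError("unreachable if attempt(m) succeeds for every m >= n")
```

With n = 5 and an attempt that succeeds once m ≥ 5, the values tried are 1, 2, 4, 8. The failed values sum to 7, which is less than the successful value 8; this is the geometric summation that lets the last trial dominate the total cost.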

Algorithm 10 (FA). 

Dependencies 

LargeWindow (Subroutine 11.2) 

SmallWindow (Subroutine 11.4) 

Variables 

    Ordered instance I = (n,T,W) // current ordered instance ; must be feasible before and after each operation
    Allocation S // current allocation for I ; must be a solution for I before and after each operation

Initialization

    Set I = (n,T,W) = (0,∅,∅) and S = ∅.
    Initialize an IDSetRQ for W, with the range query function (maximum(start − rank), minimum(end − rank)).
    Initialize an IDSetRQ for (S,W) sorted by S, with range query function (earliest window, latest window).

External Interface

    Procedure Insert( task t , window w ) // inserts task t with window w into the system
    Procedure Delete( task t ) // deletes task t from the system

Implementation

Procedure Insert( task t , window w ):

    // Create the insert state //
    Backup I,S.
    Set (I = (n,T,W),S,r) to be the ordered insert state on insertion of (t,w) into (I,S).

    // Find the range of possible insertion points in an ordered solution for I //
    Set istart = max_{i∈[1..r]}(W[i].start − i) + r.
    Set iend = min_{i∈[r..n]}(W[i].end − (i − r)).

    // Check if the insertion is feasible //
    If iend − istart < 1:
        Restore I,S.
        Return Failure.

    // Perform doubling on m from 1 up //
    Backup S.
    For m doubling from 1 up to 2n:
        Restore S.
        // Set ε̂ = 2/m and assume ε = ε̂ and use the appropriate procedure based on c and ε̂ //
        Set ε̂ = 2/m.
        If c > 2/ε̂ + 4:
            If LargeWindow( (I,S,r), m ) = Success:
                Return Success.
        Otherwise:
            If SmallWindow( (I,S,r), m, istart, iend ) = Success:
                Return Success.

    // This will never be reached //

Procedure Delete( task t ):

    If t ∈ T:
        Delete t from (I,S).
        Return Success.
    Otherwise:
        Return Failure.
Theorem 11 (FA’s properties). On an insertion of a new task, if the current instance is ε-slack, FA has the following properties:

◇ If the insertion is feasible, it returns Success after updating S to be a solution for the new instance by making at most
max(2 log_{1+…}(…) + … + 6, 14) reallocations and taking O((1/ε) log(1/ε) log(n) + log(n)²) time.

◇ If the insertion is not feasible, it returns Failure in O(log(n)) time.

Proof. By Insertion Range (Lemma 11.1), on such an insertion [istart, iend] will be set such that any slot is within that range
iff the new task can be allocated to it in some ordered solution. This implies that iend − istart ≥ 1 iff the insertion is feasible.
Thus FA will restore the previous state and return Failure iff the insertion is not feasible. Also, this feasibility check takes
O(log(n)) time as it only needs one range associative query to obtain istart, iend. If the insertion is feasible, FA will enter
the for-loop, which by iteratively doubling m and halving ε̂ will ensure that a reasonably good solution will be found and yet
only the last iteration takes a significant portion of the total time. The reason is that both LargeWindow (Subroutine 11.2)
and SmallWindow (Subroutine 11.4) have the following properties (Lemma 11.3, Lemma 11.5) when called here under the
respective conditions c > 2/ε̂ + 4 and c ≤ 2/ε̂ + 4:

◇ If m ≥ n or ε̂ ≤ ε, it will solve the insert state using at most 2⌊max(log_{1+…}(…), 0)⌋ + … + 6 reallocations and
return Success in O((1/ε̂) log(1/ε̂) log(n) + log(n)²) time.

◇ If it returns Failure, it would have taken O((1/ε̂) log(1/ε̂) log(n)) time.

Since the for-loop will run at least once with m ∈ [n, 2n) at the start of the loop, FA will always return Success in some
run of the loop and never exit the loop otherwise. Since (1/ε̂) log(1/ε̂) more than doubles when ε̂ is halved, and ε̂ = 2 or ε̂ > ½ε
in the successful loop, we obtain 2⌊max(log_{1+…}(…), 0)⌋ + … + 6 ≤ max(2 log_{1+…}(…) + … + 6, 14) and the total
time taken is O((1/ε) log(1/ε) log(n) + log(n)²) as ε → 0.

Lemma 11.1 (Insertion Range). Take any ordered insert state (I = (n,T,W),S,r), and define istart, iend as follows:

◇ istart = max_{i∈[1..r]}(W[i].start − i + r).
◇ iend = min_{i∈[r..n]}(W[i].end − i + r).

Take any slot s. Then s ⊆ [istart, iend] iff s = Y[r] for some ordered solution Y for I.

Proof. Firstly if s ⊆ [istart, iend], then let X be S sorted, which by Ordering (Theorem 1) is a partial solution for (I,r), and
let Y be as follows:

◇ Y[r] = s.
◇ Y[i] = min(X[i], s − r + i) for each i ∈ [1..r − 1].
◇ Y[i] = max(X[i], s − r + i) for each i ∈ [r + 1..n].

Then Y is an ordered solution for I, which we can check as follows:

◇ Y allocates each task to a slot within its window.
    ◇ Y[r] ⊆ [istart, iend] ⊆ W[r].
    ◇ Y[i] ⊆ [min(X[i].start, istart − r + i), X[i].end] ⊆ W[i] for each i ∈ [1..r − 1].
    ◇ Y[i] ⊆ [X[i].start, max(X[i].end, iend − r + i)] ⊆ W[i] for each i ∈ [r + 1..n].

◇ Y allocates different tasks to non-overlapping slots.
    ◇ Y[i].end = min(X[i].end, s.end − r + i) ≤ min(X[i + 1].start, s.start − r + (i + 1)) = Y[i + 1].start for each i ∈ [1..r − 2].
    ◇ Y[i].start = max(X[i].start, s.start − r + i) ≥ max(X[i − 1].end, s.end − r + (i − 1)) = Y[i − 1].end for each i ∈ [r + 2..n].
    ◇ Y[r − 1].end ≤ s.end − 1 = Y[r].start.
    ◇ Y[r + 1].start ≥ s.start + 1 = Y[r].end.

Conversely if s = Z[r] for some ordered solution Z for I, then s ⊆ [istart, iend], which we can check as follows:

◇ s.start = Z[r].start = max_{i∈[1..r]}(Z[i].start − i + r) ≥ max_{i∈[1..r]}(W[i].start − i + r) = istart.

◇ s.end = Z[r].end = min_{i∈[r..n]}(Z[i].end − i + r) ≤ min_{i∈[r..n]}(W[i].end − i + r) = iend.

Therefore the desired equivalence follows. 
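
The construction of Y in the first half of the proof can be written out as a short Python sketch. It assumes unit slots represented by their start points, a sorted partial solution X given as a list whose entry at 1-based rank r is None, and a chosen start point for the inserted slot s that already lies in the insertion range; the function name is hypothetical and the sketch does not itself verify the window constraints.

```python
from typing import List, Optional


def ordered_solution_through(
    x_starts: List[Optional[float]], r: int, s_start: float
) -> List[float]:
    """Build Y with Y[r] = s, clamping earlier slots left and later slots right."""
    y: List[float] = []
    for i in range(1, len(x_starts) + 1):
        shifted = s_start - r + i  # s moved by (i - r) unit lengths
        if i == r:
            y.append(s_start)
        elif i < r:
            y.append(min(x_starts[i - 1], shifted))
        else:
            y.append(max(x_starts[i - 1], shifted))
    return y
```

For X = [0, None, 1.5], r = 2 and s starting at 1, the later slot is pushed right to start at 2, giving Y = [0, 1, 2]; for X = [0.5, None, 3] and s starting at 1.5, neither neighbour needs to move and Y = [0.5, 1.5, 3].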
Remark. This technique to find the possible range for Y[r] in an ordered solution Y can be extended to the p-processor case
to take O(p log(n)) time per operation with judicious use of data structures, since for any ordered solution the cyclic one has
exactly the same slots, and the range query in the IDSetRQ can be suitably modified to compute the values for each set of
windows with the same rank modulo p.


Subroutine 11.2 (LargeWindow). 


Dependencies 

Jump (Subroutine 11.8) 

Implementation 

Subroutine LargeWindow( insert state (I = (n,T,W),S,r) , nat m = 2/ε̂ ):

    // Jump in both directions //
    Set jumps = ⌊max(log_{1+…}(…), 0)⌋ + 1.
    Return Jump((I,S,r), m, W[r], “both”, jumps).

Lemma 11.3 (LargeWindow’s properties). Take any ε-slack ordered insert state (I,S,r) and positive natural m = 2/ε̂. Then
LargeWindow((I,S,r),m) does the following:

◇ If c > 2/ε̂ + 4 and ( m ≥ n or ε̂ ≤ ε ), it solves (I,S,r) using at most ⌊max(log_{1+…}(…), 0)⌋ + 2/ε̂ + 1 reallocations and
returns Success in O((1/ε̂) log(1/ε̂) log(n) + log(n)²) time.

◇ If it returns Failure, it would have taken O((1/ε̂) log(1/ε̂) log(n)) time.


Proof. The lemma follows directly from Jump’s properties (Lemma 11.9 Properties 1,2), where d = 7. It only needs to be 
checked that c > | -I- 3 -f which follows from > y(|-|-4) > | + 3. 

Subroutine 11.4 (SmallWindow). 


Dependencies 

Push (Subroutine 11.6) 

Implementation 

Subroutine SmallWindow( insert state (I = (n,T,W),S,r) , nat m = 2/ε̂ , real istart , real iend ):

    Define X to be S sorted with (X[0].end, X[n + 1].start) = (−∞, ∞).
    Set a = Snap(X[r + 1] − 1, [istart, iend]).
    Set b = Snap(X[r − 1] + 1, [istart, iend]).
    Backup I,S.
    If Push((I,S,r), m, a) = Success:
        Return Success.
    Restore I,S.
    If Push((I,S,r), m, b) = Success:
        Return Success.
    Restore I,S.
    Return Failure.
Lemma 11.5 (SmallWindow’s properties). Take any ε-slack ordered insert state (I,S,r) and positive natural m = 2/ε̂ and
reals istart, iend. Then SmallWindow((I,S,r),m,istart,iend) does the following:

◇ If c ≤ 2/ε̂ + 4 and ( m ≥ n or ε̂ ≤ ε ) and ( s ⊆ [istart, iend] iff s = Y[r] for some ordered solution Y for I ) for
any slot s, it solves (I,S,r) using at most 2⌊max(log_{1+…}(…), 0)⌋ + … + 6 reallocations and returns Success in
O((1/ε̂) log(1/ε̂) log(n) + log(n)²) time.

◇ If it returns Failure, it would have taken O((1/ε̂) log(1/ε̂) log(n)) time.

Proof. The short summary is that Push((I,S,r),m,s) succeeds with the right choice of s if I is feasible, and it turns out that
it is enough to try just a and b specified in the subroutine.

Let E be some ordered (1 + ε)-partial solution for (I,r). Then istart ≤ E[r + 1].start, otherwise there is some ordered solution
Y for I such that Y[r].start > E[r + 1].start and so concatenating (E[i].start + [0,1] : i ∈ [1..r − 1]), (E[r + 1].start + [0,1])
and Y[r + 1..n] gives an ordered solution for I that implies istart ≤ E[r + 1].start. Likewise iend ≥ E[r − 1].end.

Next let N be some near ordered solution for (I,S,r) by Near’s properties (Theorem 6). Then istart ≤ N[r].start ≤
X[r + 1].start and iend ≥ N[r].end ≥ X[r − 1].end. From these we get X[r − 1] ≤ a, b ≤ X[r + 1] because of the following
inequalities arising from Snap’s properties (Theorem 8):

◇ a ≤ max(X[r + 1] − 1, istart + [0,1]) ≤ X[r + 1].

◇ a ≥ min(X[r + 1] − 1, iend + [−1,0]) ≥ X[r − 1].

◇ b ≤ max(X[r − 1] + 1, istart + [0,1]) ≤ X[r + 1].

◇ b ≥ min(X[r − 1] + 1, iend + [−1,0]) ≥ X[r − 1].

Also, b.end − a.start ≤ max(X[r − 1].end, istart) − min(X[r + 1].start, iend) + 2 ≤ 2, which gives
(E[r + 1].end − b.end) + (a.start − E[r − 1].start) ≥ 2(1 + ε) − 2 = 2ε and hence either E[r + 1].end − b.end ≥ ε
or a.start − E[r − 1].start ≥ ε. In addition, if a overlaps X[r + 1], then a.start = istart ≤ E[r + 1].start and so
E[r + 1].end − a.end ≥ ε. Likewise, if b overlaps X[r − 1], then b.start − E[r − 1].start ≥ ε. Together they imply that for at
least one s ∈ {a, b} all the following hold:

◇ If s overlaps X[r + 1], then E[r + 1].end − s.end ≥ ε.

◇ If s overlaps X[r − 1], then s.start − E[r − 1].start ≥ ε.

Therefore by Push’s properties (Lemma 11.7), this lemma follows directly.
Subroutine 11.6 (Push).

Dependencies

    Jump (Subroutine 11.8)

Implementation

Subroutine Push( insert state (I = (n,T,W),S,r) , nat m = 2/ε̂ , slot s ):

    Define X to be S sorted with (X[0].end, X[n + 1].start) = (−∞, ∞).

    // Sort the slots within W[r] //
    Let u[1..q] be an ordered sequence such that { S[u[i]] : i ∈ [1..q] } = { S[i] : i ∈ [1..n]∖{r} ∧ S[i] ⊆ W[r] }.
    Sort S[u[1..q]].

    // Allocate the inserted task to s //
    Set S[r] = s.

    // Handle overlapping slots on both sides //
    Set jumps = ⌊max(log_{1+…}(…), 0)⌋ + 1.

    For i from r + 1 up to n as long as S[i − 1] overlaps X[i]:
        Set i' ∈ [i..n] such that S[i'] = X[i].
        // Jump if pushing m slots is insufficient //
        If i = r + m + 1:
            Set S[i'] = null.
            If Jump((I,S,i'), m, [S[r + m].end, W[i'].end], “right”, jumps) = Failure:
                Return Failure.
            Exit For.
        // Swap into place and push aside the neighbouring slot //
        Swap S[i], S[i'].
        Set S[i] = S[i − 1] + 1.

    For i from r − 1 down to 1 as long as S[i + 1] overlaps X[i]:
        Set i' ∈ [1..i] such that S[i'] = X[i].
        // Jump if pushing m slots is insufficient //
        If i = r − m − 1:
            Set S[i'] = null.
            If Jump((I,S,i'), m, [W[i'].start, S[r − m].start], “left”, jumps) = Failure:
                Return Failure.
            Exit For.
        // Swap into place and push aside the neighbouring slot //
        Swap S[i], S[i'].
        Set S[i] = S[i + 1] − 1.

    Return Success.

Lemma 11.7 (Push’s properties). Take any ε-underallocated ordered insert state (I = (n,T,W),S,r) and positive natural
m = 2/ε̂. Let X be S sorted with (X[0].end, X[n + 1].start) = (−∞, ∞). Take any ordered (1 + ε)-partial solution E for (I,r)
and slot s such that all the following hold:

◇ X[r − 1] ≤ s ≤ X[r + 1].

◇ s = Y[r] for some ordered solution Y for I.

◇ If s overlaps X[r + 1], then E[r + 1].end − s.end ≥ ε.

◇ If s overlaps X[r − 1], then s.start − E[r − 1].start ≥ ε.

Then Push((I,S,r),m,s) does the following:

◇ If c ≤ 2/ε̂ + 4 and ( m ≥ n or ε̂ ≤ ε ), it solves (I,S,r) using at most 2⌊max(log_{1+…}(…), 0)⌋ + … + 6 reallocations and
returns Success in O((1/ε̂) log(1/ε̂) log(n) + log(n)²) time.

◇ If it returns Failure, it would have taken O((1/ε̂) log(1/ε̂) log(n)) time.

Proof. We shall divide the proof according to the parts of the subroutine.
High-level overview

Push sorts all the slots in S that are within the inserted window, which ensures that all slots in earlier windows are before s
and all slots in later windows are after s. Subsequently, it allocates the inserted task to s, which may cause overlaps on both
sides. To fix that, it goes through up to m slots on each side of s, for each of them using a swap to align it with X and then
pushing it aside so that it no longer overlaps the previous slot. If after m slots the last pushed slot S[i] still overlaps the next
one S[j], it deallocates S[j] and executes Jump with starting window W[j] and starting span the subinterval of W[j] that is
beyond S[i]. If the stipulated conditions are met, the starting span will be large enough for Jump to succeed.

Symmetry

It suffices to analyze the first for-loop since the second for-loop and all the lemma conditions are symmetrical about r.

Sorting part 

First we shall consider the situation just after sorting the slots within W[r]. Since the number of windows before W[r] is the
same as the number of slots before X[r], we have:

#({ i : i ∈ [1..r − 1] ∧ S[i] > X[r − 1] })

= (r − 1) − #({ i : i ∈ [1..r − 1] ∧ S[i] ≤ X[r − 1] })

= #({ i : i ∈ [1..n]∖{r} ∧ S[i] ≤ X[r − 1] }) − #({ i : i ∈ [1..r − 1] ∧ S[i] ≤ X[r − 1] })

= #({ i : i ∈ [r + 1..n] ∧ S[i] ≤ X[r − 1] }).

But for any i ∈ [1..r − 1] and j ∈ [r + 1..n] such that S[i] > X[r − 1] and S[j] ≤ X[r − 1], we would have S[j] < S[i] and
S[i].end ≤ W[i].end ≤ W[r].end and S[j].start ≥ W[j].start ≥ W[r].start, which imply S[i], S[j] ⊆ W[r] and hence contradict
the fact that S[u[1..q]] is sorted. Therefore we must have the counting identity #({ i : i ∈ [1..r − 1] ∧ S[i] > X[r − 1] }) =
#({ i : i ∈ [r + 1..n] ∧ S[i] ≤ X[r − 1] }) = 0.

Pushing part 

Next we shall prove that for the first for-loop in the pushing part the following invariances hold before each iteration:

1. #({ j : j ∈ [1..i − 1] ∧ S[j] > X[i − 1] }) = 0.

2. S[j] = X[j] = S[r] + (j − r) for any j ∈ [r..i − 1].

3. S[i'] = X[i] for some i' ∈ [i..n].

4. S[r..n] is a valid allocation of T[r..n] except possibly that S[i − 1] overlaps X[i].

Invariance 1 holds by Invariance 2, which holds by construction, and thus Invariance 3 holds since either X[i] > X[i — 1] or 
2f[i] = [i — 1] = — 1] > for any j G —2]. Invariance 4 follows from Invariance 3, because the swap will not cause 

any additional violation of allocation validity by choice of s, and because S'[f] will be shifted to — 1] + 1 — X[i — l]-\-l < 
2l [i -f 1] and so will at most overlap [z +1]. 

Jumping part 

Finally if Jump is executed, m < n, otherwise all slots would have been pushed, and thus the following inequalities hold:

E[r + m + 1].end − S[r + m].end ≥ (E[r + 1].end + m(1 + ε)) − (S[r].end + m)

= (E[r + 1].end − S[r].end) + m·ε ≥ (m + 1)ε [by Invariance 1].

(to + 1)£-1 = (| + 1)£-1>| + £> max^l + ^£, 

And so the lemma follows quite directly from Jump’s properties (Lemma 11.9 Properties 3,4), where d = 7. It should be noted
though that the state (I,S,r) we are passing to Jump is strictly speaking not an insert state, but it does not matter. To verify
that it still works, just prior to calling Jump we can pretend modify — 1] to make it an insert state due to the counting 

identity and the fact that s = Y[r] for some ordered solution Y for I, and then Jump’s properties hold. Afterward we just undo
our pretend modification, which does not interfere with Jump as a consequence of Property 4.


Total costs 

The number of reallocations made by the sorting and pushing parts is bounded by max(c, m) + 3m ≤ … + 4, and hence
the total number of reallocations is at most 2⌊max(log_{1+…}(…), 0)⌋ + … + 6. The time taken by sorting and pushing is
obviously O((c + 2m) log(n)) ⊆ O((1/ε̂) log(n)), which is dominated by the time taken by Jump.
Subroutine 11.8 (Jump).

Implementation

Subroutine Jump( insert state (I = (n,T,W),S,r) , nat m = 2/ε̂ , interval U , string dir , nat jumps ):

    Set V = U.

    For jmp from 0 up to jumps:

        // Enter the final phase if V has sufficient empty space //
        Subroutine count( interval w ):
            Return #({ i : i ∈ [1..n]∖{r} ∧ S[i] ⊆ w }).
        Subroutine space( interval w ):
            Set sstart = max(w.start, max_{i∈[1..n]∖{r} : S[i].start < w.start} S[i].end).
            Set send = min(w.end, min_{i∈[1..n]∖{r} : S[i].end > w.end} S[i].start).
            Return (send − sstart) − count(w).
        Set blocks = ⌈(count(V) + 1)/(m + 1)⌉.
        If space(V) ≥ blocks:

            // Find a block of at most m + 1 gaps within V having empty space at least 1 //
            Binary search to find interval R such that all the following hold:
                ◇ R ⊆ V.
                ◇ count(R) ≤ m.
                ◇ space(R) ≥ 1.
                ◇ S[i] ⊆ R for any S[i] that overlaps R.

            // Sort the slots within R //
            Let q = count(R).
            Let u[1..q] be an ordered sequence such that { u[i] : i ∈ [1..q] } = { i : i ∈ [1..n]∖{r} ∧ S[i] ⊆ R }.
            Sort S[u[1..q]].

            // Pack the slots aside within R to leave a gap G of length 1 //
            Set i' = q + 1.
            Set G = R.end + [−1,0].
            For i from 1 up to q:
                Set x = max(R.start + (i − 1), S[u[i]].start − 1).
                If x < W[u[i]].start:
                    Set i' = i.
                    Set G = x + [0,1].
                    Exit for.
                Set S[u[i]] = x + [0,1].
            For i from i' up to q:
                Set x = min(R.end − (q − i), S[u[i]].end + 1).
                Set S[u[i]] = x + [−1,0].

            // Reallocate along cascade from W[r] to G //
            If dir = “both”:
                Set dir = ( G.start < U.start ? “left” : “right” ).
            Set j = r.
            While G ⊄ W[j]:
                Define jslots = { i : i ∈ [1..n]∖{r} ∧ S[i] ⊆ W[j] }.
                Set j' = ( dir = “left” ? min(jslots) : max(jslots) ).
                Set S[j] = S[j'].
                Set j = j'.
            Set S[j] = G.
            Return Success.

        // Jump to the furthest window[s] whose slot is within V //
        Define jslots = { i : i ∈ [1..n]∖{r} ∧ S[i] ⊆ V }.
        If jslots = ∅:
            Return Failure.
        Set rstart = W[min(jslots)].start.
        Set rend = W[max(jslots)].end.
        If dir = “both”:
            Set V = [min(rstart, V.start), max(rend, V.end)].
        Otherwise:
            Set V = ( dir = “left” ? [rstart, V.end] : [V.start, rend] ).

    // Return failure if too many jumps are used //
    Return Failure.

Lemma 11.9 (Jump’s properties). Take any ordered insert state {I,S,r) and positive natural to = | and real e > e. Take also 
any interval U C W[r] and dir € {“left”, “right”,“both”} and natural juTO-ps. Then Jump((I,S,r),m,l7,dir,jumps) has the 
following properties: 

1. It solves (/, S', r) using at most to reallocations if all the following hold: 

“O' TO > n. 

4- span([/) > n+ 1 . 

4 jumps > 1. 

2. It solves (/,S, r) using at most {jumps + m) reallocations if for some d > 0 all the following hold: 

4 (/, S, r) is £-underallocated. 

4 U = W[r]. 

4 dir = “both”. 

4 c>l+3+C^. 

4 jumps > max^logj^_l_i^(^),0^+1. 

3. It solves (/,S,r) using at most {jumps + m) reallocations if for some d > 0 either of the following holds: 

4 For some r' € [l..r — 1] and ordered (1 + ej-partial solution E for {I,r —m), all the following hold: 

4 #({ i : i€ [l..r'] AS[z] > S[r'] }) = 0. 

4 (7 = [S[/].end,ll/[r].end]. 

4 dir = “righf’. 

4 E[r + IJ.end — S[/].end — 1 > max^l + ^e, ■ 

4 jumps >max(^log^_l_i^(^P^,0^+1. 

4 For some / S [r + l..n] and ordered (1 + £)-partial solution i? for (/,r^ + to), all the following hold: 

4 #({ i : i€ [r'..n] AS[z] < S[r'] }) = 0. 

4 (7 = [kl/[r].start,S[/].start]. 

4 dir = “left”. 

4 E[r' — 1].start — S[/].start +1 < — max^l + ^t, ■ 

4 jumps >max(^log^_l_i^(^P^,0^+1. 

4. If it solves (I,S,r), it also satisfies all the following:

• It performs reallocations to slots completely after U.start if dir = “right” and before U.end if dir = “left”.

• It returns Success in O((1/ε) log(1/ε) log(n) + log(n)^2) time.

5. If it returns Failure, it would have taken O(jumps · log(n)) time.

Proof. The proof is divided roughly according to the parts of the subroutine. 
High-level overview

Here is a high-level sketch of what Jump does. In the initial phase, k will be the current window index, and V will be the
current span. (k,V) are initialized based on the input parameters, and jumps will be made only in the specified direction, each
time to the furthest window that has a slot completely within the current span. After each jump V will be extended in that
direction to match the reach of the new window. The final phase is entered when the current span has sufficient total empty
space to guarantee that it contains a region R containing at most m slots and having empty space of at least 1, upon which the
slots within R will be sorted and packed within R to leave a unit gap G, and then the solution is finished by reallocating the
slots on the cascade of jumps from W[r] to G.

Jump succeeds in the specified situations roughly because of the following high-level reasons. The fact that the original
instance is ε-slack before insertion implies that the windows stretch out at an average rate of at least about 1 + ε, while in the
initial phase the slots stretch out at an average rate of at most about 1 + 1/m. So if 1/m < ε, the current span V grows more
or less exponentially. All that the proof really depends on is that m ∈ O(1/ε) and ε − 1/m ∈ Ω(ε), so the particular choice of m
is just to make the computations simple instead of attempting to obtain optimal bounds.

Jumping sequence 

Before getting to the proof, here is the precise definition of jumps. Call v a jumping sequence iff the following hold:

• v is a strictly monotonic sequence of indices.

• S[v[2..#(v)]] is a strictly monotonic sequence of slots.

• v.first = r.

• W[v[i]] ⊇ S[v[i+1]] for any i ∈ [1..#(v)−1].

Additionally, we say that v covers a point x iff x ∈ span(W[v]). Note that a jumping sequence is allowed to have jumps that
are not to the furthest possible window. It is quite clear from the construction of Jump that before any iteration of the main
for-loop, for any x ∈ V there is some jumping sequence v of length at most jmp+1 that covers x. Note that the only condition
on U needed for this is U ⊆ W[r], and it is sufficient to guarantee that a solution will be found if the final phase is entered.

Failure characteristics 


Clearly each invocation of count and space takes O(log(n)) time using only a few search queries to the IDSetRQ. Also, Jump
does not compute the whole set jslots but just its minimum and maximum, which takes only two search queries and one range
query to the IDSetRQ, which amounts to O(log(n)) time for each jump. Therefore if Jump returns failure it would have taken
only O(jumps · log(n)) time in total.

Final phase 

If span(U) ≥ n+1 and m ≥ n, then in the very first iteration of the for-loop over jmp we have blocks ≤ 1 (as count(V) ≤ n ≤ m) and
space(V) ≥ span(V) − n = span(U) − n ≥ 1, and hence the final phase will be entered, in which the sorting part and U ⊆
W[r] ensure that the packing part succeeds, giving Property 1.


Searching part 


If the final phase is entered, the binary search can be carried out, because the intervening gaps between the slots within V can
be partitioned into ⌈(count(V)+1)/(m+1)⌉ blocks of m+1 or less, and space is additive on intervals, and so the binary search can keep
halving the current set of consecutive blocks by choosing the half that has the larger average space per block, which ensures
that it will obtain a single block R with total space at least 1. It is also easy to guarantee that S[i] ⊆ R for any S[i] that overlaps
R, by trimming R to exclude any slots that cross its boundary. The binary search takes O(log(blocks) log(n)) ⊆ O(log(n)^2)
time.
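
The halving argument can be sketched in Python as follows. This is only an illustration under assumptions: the function name find_roomy_block is hypothetical, the per-block empty space is given as an explicit non-empty list, and prefix sums stand in for the O(log(n)) space queries that Jump makes to the IDSetRQ. Since the larger of the two halves' averages is never below the overall average, a list whose average is at least 1 always yields a block with space at least 1.

```python
from itertools import accumulate
from typing import List


def find_roomy_block(space_per_block: List[float]) -> int:
    """Return the index of a block whose space is at least the overall
    average, by repeatedly keeping the half with the larger average."""
    prefix = [0.0, *accumulate(space_per_block)]  # space is additive on intervals
    lo, hi = 0, len(space_per_block)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left_avg = (prefix[mid] - prefix[lo]) / (mid - lo)
        right_avg = (prefix[hi] - prefix[mid]) / (hi - mid)
        if left_avg >= right_avg:
            hi = mid
        else:
            lo = mid
    return lo


spaces = [0.5, 0.5, 2.5, 0.5, 1.0]  # average space per block is exactly 1
print(find_roomy_block(spaces))
```

Each iteration halves the candidate range, so the number of iterations is logarithmic in the number of blocks, matching the O(log(blocks)) factor in the stated running time.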


Sorting part 

S[u[1..q]], which are the slots within R, is then sorted, which takes O(m·log(n)) ⊆ O((1/ε) log(n)) time, and by Ordering
(Theorem 1) S remains a partial solution for I. Let S0 be the original S before the sorting. Take any shortest increasing
jumping sequence v0 that covers R.end before the sorting. After sorting, v0 may be no longer a jumping sequence, but we can
create a new jumping sequence v that covers R.end by modifying v0.

For each i just before j in v0, we have S[i].end < S[j].end ≤ R.end by minimality of v0. If S[j] ⊄ R, we have
S[j].start < R.start and hence S[i],S[j] are unchanged, so we do nothing. If S[j] ⊆ R, we insert each element in
A = { a : a ∈ u[1..q] ∧ i < a < j } into v such that v remains strictly increasing.
Now consider any b ∈ A∪{j} and let a be just before b in v. If a ∈ u[1..q], we have S[a] < S[b] because S[u[1..q]] is ordered.
If a ∉ u[1..q], it must be that a = i and so S[a].end < R.end, which gives S[a].start < R.start since S[a] ⊄ R, and hence S[a] <
S[b]. Therefore in all cases S[a] < S[b].

Also, since the number of windows with slots before S[b] within R is the same as the number of windows before W[b] with
slots within R, we have:

#({ k : k ∈ u[1..q] ∧ k ≥ b ∧ S0[k] < S[b] })

= #({ k : k ∈ u[1..q] ∧ S0[k] < S[b] }) − #({ k : k ∈ u[1..q] ∧ k < b ∧ S0[k] < S[b] })

= #({ k : k ∈ u[1..q] ∧ k < b }) − #({ k : k ∈ u[1..q] ∧ k < b ∧ S0[k] < S[b] })

= #({ k : k ∈ u[1..q] ∧ k < b ∧ S0[k] ≥ S[b] }).

And so if W[a].end < S[b].end, for any k ∈ u[1..q] such that k < b we have k ≤ a and so S0[k].end ≤ W[k].end ≤ W[a].end <
S[b].end, and hence W[a].end ≥ W[i].end ≥ S0[j].end ≥ S[b].end since j ∈ u[1..q] and j ≥ b. Therefore in all cases W[a].end ≥
S[b].end.

Therefore, after all the insertions, v is once more an increasing jumping sequence that covers R.end. Also, all the elements
added to v are in u[1..q].

Packing part 

Next is the packing part, in which the first for-loop does left-packing and the second for-loop does right-packing. Call each
packing successful iff S is still a valid allocation after it is completed. First note that since S[u[1..q]] is ordered, by induction
R.start + (i−1) ≤ S[u[i]].start for any i ∈ [1..q], and likewise R.end − (q−i) ≥ S[u[i]].end for any i ∈ [1..q]. Thus the
first for-loop sets x ≤ S[u[i]].start and hence sets S[u[i]] ⊆ W[u[i]], for each i ∈ [1..i'−1]. Also, it sets S[u[1..i'−1]] to
non-overlapping slots, because max(R.start + (i−1), S[u[i]].start − 1) + 1 ≤ max(R.start + i, S[u[i+1]].start − 1) for any
i ∈ [1..q−1]. Since S[u[1..i'−1]] are also all still within R and not shifted right, they do not overlap other slots in S, and
hence the left-packing always succeeds. Likewise, the right-packing succeeds if the second for-loop always sets S[u[i]].end ≤
W[u[i]].end for each i ∈ [i'..q], which is what we shall now prove.

If the right-packing fails and sets S[i].end > W[i].end for some i ∈ u[i'..q], then it suffices to consider the case that i > r
because this asymmetric part works if and only if the symmetric version that tries both directions works. Basically, since the
symmetric version guarantees some way of packing that works, the asymmetric version left-packs at least as many slots as the
symmetric version, and hence can definitely right-pack the rest. Anyway this is inconsequential and so details are omitted.

Let v be a shortest increasing jumping sequence that covers R.end, which exists by the earlier argument because R.end ≥
S[i].end > W[i].end ≥ W[r].end. Then v.first < i < v.last since W[v.last].end ≥ R.end ≥ S[i].end > W[i].end. Let h be just
before j in v such that h ≤ i < j, and let S1 be the original S before packing. Then S1[j].start ≥ W[j].start ≥ W[i].start ≥
W[u[i']].start > R.start and S1[j].end ≤ S1[v.last].end ≤ R.end by minimality of v, and hence S1[j] ⊆ R. Thus S1[i] < S1[j]
since S1[u[1..q]] is ordered. This gives S[i].end ≤ S1[i].end + 1 ≤ S1[j].end ≤ W[h].end ≤ W[i].end, which contradicts the
failure of the right-packing.

Therefore the packing always succeeds, and sets G such that G ⊆ R and G does not overlap any slot in S, because either i' ≤
q and G.end = max(R.start + (i'−1), S[u[i']].start − 1) + 1 ≤ min(R.end − (q−i'), S[u[i']].end + 1) − 1 = S[u[i']].start, or
i' = q+1 and G.start = R.end − 1 ≥ max(R.start + (q−1), S[u[q]].start − 1) + 1 = S[u[q]].end.

Cascading part 

Next, we shall prove that the cascade at the end finishes the modification of S to a solution for I. Just as in the main for-loop,
this while-loop makes j trace out some jumping sequence, but we will not actually need to prove so much. Let f[l+1] be the
value of j after l iterations. It suffices to handle the case that dir = “right”. Note that f is increasing, because either it has
only one element, or G ⊄ W[r] and so G.end > U.end since there is some increasing jumping sequence v that covers R.end, in
which case ¬(G.start < U.start) since span(U) ≥ 1. After each iteration of the while-loop, S is a solution for I except for
an exact overlap at S[j], so Jump is done if the while-loop terminates. To check that this always occurs, we shall prove that
v[l+1] ≤ f[l+1] after l iterations of the while-loop as follows.

After 0 iterations, the invariance holds trivially because v[1] = r = f[1]. After l iterations where l > 0,
let l' = max({ i : i ∈ [1..#(v)] ∧ v[i] ≤ f[l] }) ≥ l by the invariance. Then by the monotonicity of v, f,
span(W[f[1..l]]) ⊇ span(W[v[1..l']]). Also, G ⊆ R ⊆ span(W[v]), but G ⊄ span(W[v[1..l']]) because
G ⊄ span(W[f[1..l]]) by the while-loop condition. Thus l' < #(v) and so v[l'] ≤ f[l] < v[l'+1] and
W[v[l']] ⊇ S[v[l'+1]], which gives W[f[l]] ⊇ [W[v[l'+1]].start, W[v[l']].end] ⊇ S[v[l'+1]], and hence
f[l+1] = max({ i : i ∈ [1..n] ∧ S[i] ⊆ W[f[l]] }) ≥ v[l'+1] ≥ v[l+1].
Therefore the while-loop runs for at most (#(v) − 1) iterations, otherwise G ⊆ span(W[v]) ⊆ span(W[f[1..#(v)]]),
contradicting the while-loop condition on iteration #(v). Thus the cascading part makes at most (#(v) − 1) reallocations, of which
at most #(v0) − 1 = jmp ≤ jumps are to slots outside R.
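
To make the cascade concrete, here is a small Python sketch of the while-loop. It rests on assumptions: the name cascade, the dictionary representation, and the explicit error checks are hypothetical additions, and jslots is computed by a linear scan instead of the range queries used by Jump.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]  # (start, end)


def within(a: Interval, b: Interval) -> bool:
    """True iff interval a lies completely inside interval b."""
    return b[0] <= a[0] and a[1] <= b[1]


def cascade(S: Dict[int, Optional[Interval]], W: Dict[int, Interval],
            r: int, G: Interval, direction: str) -> int:
    """Hand slots along the cascade from W[r] to the gap G, modifying S in place.
    Returns the number of loop iterations, i.e. of previously allocated tasks moved."""
    iterations = 0
    j = r
    while not within(G, W[j]):
        jslots = [i for i, s in S.items()
                  if i != r and s is not None and within(s, W[j])]
        if not jslots:
            raise ValueError("no slot within W[j]; the cascade cannot continue")
        nxt = min(jslots) if direction == "left" else max(jslots)
        if nxt == j:
            raise ValueError("cascade made no progress")  # guard added for the sketch
        S[j] = S[nxt]  # task j takes over the slot of task nxt
        j = nxt
        iterations += 1
    S[j] = G  # the last task on the cascade moves into the gap
    return iterations


# Hypothetical example: task 2 is new, and the unit gap G lies at the far right.
W = {1: (0, 2), 2: (0, 2), 3: (1, 3), 4: (2, 4)}
S = {1: (0, 1), 2: None, 3: (1, 2), 4: (2, 3)}
print(cascade(S, W, r=2, G=(3, 4), direction="right"), S)
```

In this example tasks 3 and 4 are each shifted one slot to the right, so two existing tasks are reallocated before the new task is placed.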

Total costs 

In total Jump makes at most (m+jumps) reallocations and takes O(log(n)^2 + m·log(n) + jumps·log(n)) time.

Initial phase 

What remains to establish the lemma is to prove that the final phase is entered in either of the two specified situations. So
we shall look at the jumping part of the initial phase, just before V is modified. Let V' be the new value of V after the
modification, and let s = span(V) and s' = span(V') and z = count(V).


Property 2 

Next, consider Property 2 as specified in the lemma statement (Lemma 11.9). Then s — ( 2 - 1 - 2 ) < space(L) < blocks = 
Thus s< 2 +^ + 3 = 2 ^+3, andhence 2 >(s- 3 )^ = (s-3)J^>(s-3)|^>0 
since s > c > 3. Thus 2 > 0 and so j slots = {i : i € [l..n]\{r} AS'[i] GV}y^0. Let a = min(jsZofs) and b = ma:x.{j slots). 
Then j^{[a..b]) > 2 , which implies that s' > span(VL[a..&]) > 2(1 + e). 

From these we get s' > {s — 3) 2+21 T £) = (s — 3)(1 -I- ^e), which gives — (| -I- 3) > (s — (| -1-3)) (1 -f ^e), and by 
induction s-(f+3) > (c-(f-f 3)) (1 + > ^(1 + As before, (s-(|+3))i£ < s'- s < 2(c-l) 

because L^end = VF[6].end < 5'[6].end-l- (c— 1) < L.end-f (c— 1) and similarly L^start > L.start— (c— 1), and hence 
s —(j-1-3) < Therefore jmp < log, 1 (^), and so the for-loop will always enter the final phase since 

jumps > max(^log^_^i^(^),0^ -f 1. 

Property 3 


First, consider Property 3 as specified in the lemma statement (Lemma 11.9). By symmetry it suffices to consider the 
case of dir = “right”. Since L.start = [/.start and no slot straddles [/.start, s — (2 -f 1) < space(L) < blocks = 
= Thus s< 2+^+2 = 2 ^ + 2, and hence 2 >(s- 2 )^ = (s- 2)|^. We have 

[/.end = lL[r].end > i/[r].end > E[r' + Ij.end -|- (r — — 1)(1 -f £), which gives s — 1 = span(l/) — 1 > span([/) — 1 

> (r — r' — 1)(1 -f £) -f L^[/].end — [/.start — 1 > {r — r' — 1)(1 -f e) -I- max(l + ^£, -^^- 3 ^) and so s > 2. Thus 
2 >r — / — 1>0 and so j slots = {i : i € [l..n]\{r} AS”)/] QV} 0, and hence max(jsZo/s) > / -f 2 -I- 1 since 
jslots C [r' + l..n]\{r}, which implies that rend > i?[max(jsZo/s)].end > E[r' + 1].end-f 2(1 -I- £). 


From these we get the following inequality: 

s' — 1 = rend — [/.start — 1 

> 2(1 -I-£) -I- {E[r' + 1 ].end — [/.start — 1) 

> (s-2)^^(1-I-£)-I-(1-I-5£) 

- (®“2)^^(1-|-£)-|-(1-|-5£) 

= (s- 1)(1-|- 2£). 

Thus by induction s — 1 > ■^—3^(1 + Furthermore, rearranging the inequality results in (s — 1 )^£ < s' — s < c—1 

because L[end = Vl/[max(js/o/s)].end< 5 '[max(js^ofs)].end-f (c— 1) < L.end-|-(c— 1 ), and hence s — 1 < . Therefore 


jmp < log^^ 1 g , and so the for-loop will always enter the final phase since jumps > max^log^^ 1 ^ 10 


-fl. 


Conclusion 


It is now trivial to verify the remaining properties that we have not explicitly justified. Therefore the lemma is proven at last. 
9 Variable window length


In this section we now turn to the variant of the problem that allows variable window lengths. We shall first show that in 
the case of sufficiently small slack there is no efficient allocator. Subsequently we present for the case of sufficiently large 
underallocation an allocator that we conjecture to be efficient, but we have been unable to prove it. There are also a number 
of unanswered questions that do not fall under either of these two cases. 

9.1 Small slack 

Even if the windows must have bounded relative ratio, if the slack is sufficiently small, it is not difficult to come up with a
sequence of operations that requires Ω(n) reallocations per operation on average with at most n tasks in the system at any
time. One example is as follows. Insert tasks with windows [2i, 2i+3]γ and [2i+1, 2i+2]γ for each i ∈ [1..k] on 1 processor
where γ = 1 + ε. Then repeatedly insert and delete a task with window alternating between [2,3]γ and [2k+2, 2k+3]γ. The
instances are all clearly ε-slack with relative window ratio at most 3, but require k reallocations for each alternation for any
ε ∈ (0,1/3).

In general, if the relative window ratio is allowed to be at most 1 + 2/m for some positive integer m, there is a pair of
ε-slack instances with (m+1)k + 1 tasks along the same lines such that alternating between them requires k reallocations per
operation for any ε ∈ (0, 1/(2m+1)), having k groups of m+1 windows, each group having m windows of length mγ centred
with 1 window of length (m+2)γ, the larger windows in adjacent groups overlapping by γ, where γ = 1 + ε. For p processors
the situation is no better, since the above instances can be just multiplied into p copies, which require Ω(n/p) reallocations
per operation.

This answers the question posed by Bender et al. in the negative, but it leads to others: If £ > ^, or if £ € (0,1) and the relative 
window ratio is at most 1 + is there an allocator such that the number of reallocations it makes on each insertion depends 
only on the slack £? If one exists, it definitely cannot work on arbitrary feasible insert states because it is easy to construct 
some that necessitate Ω(log(n)) reallocations, as we shall see in the next section.

9.2 Large underallocation 

For large underallocation, it seems that it is more natural to describe ε-slack instances as γ-underallocated where γ = 1 + ε.
Also, a useful special type of instance is one with aligned windows, windows whose endpoints are consecutive multiples of a
power of 2. The reason is that aligning all the windows of a 4γ-underallocated instance gives a γ-underallocated instance for
any γ that is a power of 2, so for sufficiently large underallocation it reduces to solving the problem for the aligned instance.

Unlike the fixed window variant, for any allocator A, no matter how large γ is, there is some sequence of operations comprising
just insertions such that the instance is always both aligned and γ-underallocated but requires at least one reallocation, and
furthermore there is a γ-underallocated aligned insert state that requires Ω(log(n)/log(γ)) reallocations where n is the current
number of tasks in the system. The latter also means that any allocator that takes o(log(n)) reallocations cannot work on
generic insert states and must avoid such bad insert states.

The rest of this section contains the precise statements and proofs of the above theorems. Finally, we set forth an allocator
VA and our hypothesis that it takes at most 1 reallocation per insertion as long as the aligned instance is always
2-underallocated.

Definition 12 (Aligned interval). Call an interval aligned iff it is [a, a+1]2^k for some k, a ∈ Z.

Definition 13 (Align). For any interval X, let align(X) be some largest aligned interval within X. Also, whenever we say
that we align X we mean that we change it to align(X).

Definition 14 (Interval’s halves). For any interval X, let left(X) and right(X) be the left half of X and right half of X
respectively.

Definition 15 (Aligned interval’s parent). For any aligned interval X, let parent(X) be the unique aligned interval of length
2span(X) that contains X.

Definition 16 (Aligned interval’s sibling). For any aligned interval X, let sibling(X) be the unique aligned interval Y such
that parent(X) = parent(Y) and X ≠ Y.
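
Definitions 12 to 16 can be sketched in Python as below. The sketch makes simplifying assumptions that are not in the definitions: intervals are pairs of integers, so only k ≥ 0 is handled, left and right expect an even span, and align breaks ties by returning the leftmost largest aligned interval (the definition allows any largest one).

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end) with integer endpoints


def span(x: Interval) -> int:
    return x[1] - x[0]


def is_aligned(x: Interval) -> bool:
    """True iff x = [a, a+1] * 2^k for some integers a and k >= 0."""
    s = span(x)
    return s >= 1 and s & (s - 1) == 0 and x[0] % s == 0


def align(x: Interval) -> Interval:
    """A largest aligned interval within x (leftmost on ties)."""
    s, e = x
    for k in range((e - s).bit_length() - 1, -1, -1):
        size = 1 << k
        a = -(-s // size)  # ceiling of s / size
        if (a + 1) * size <= e:
            return (a * size, (a + 1) * size)
    raise ValueError("interval must have length at least 1")


def left(x: Interval) -> Interval:
    return (x[0], (x[0] + x[1]) // 2)


def right(x: Interval) -> Interval:
    return ((x[0] + x[1]) // 2, x[1])


def parent(x: Interval) -> Interval:
    size = 2 * span(x)
    start = (x[0] // size) * size
    return (start, start + size)


def sibling(x: Interval) -> Interval:
    p = parent(x)
    return right(p) if x == left(p) else left(p)


print(align((3, 10)), parent((4, 8)), sibling((4, 8)))
```

For instance, the largest aligned interval within [3,10] is [4,8], whose parent is [0,8] and whose sibling is [0,4].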

Theorem 17 (Align’s properties). Take any aligned interval X. Then there is some interval Y with length 4span(X) such
that ( Y ⊇ Z for any interval Z such that align(Z) ⊆ X ).
Proof. Let k, a ∈ Z such that X = [a2^k, (a+1)2^k]. By symmetry we can assume that a is even. Let Y = [(a−2)2^k, (a+2)2^k].
Then span(Y) = 4span(X). Now take any interval Z such that align(Z) ⊆ X. Then Z.start > (a−2)2^k otherwise align(Z)
is smaller than the aligned window [(a−2)2^k, a2^k] ⊆ Z. Similarly if align(Z) = X, we have Z.end < (a+2)2^k otherwise
align(Z) is smaller than [a2^k, (a+2)2^k] ⊆ Z. Finally, if align(Z) ≠ X, we again have Z.end < (a+2)2^k otherwise align(Z)
is smaller than [(a+1)2^k, (a+2)2^k] ⊆ Z. Therefore in all cases Z ⊆ Y.
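
As a sanity check rather than a proof, the statement of Theorem 17 can be tested by brute force over small integer intervals. The following Python sketch assumes integer endpoints, a leftmost tie-breaking rule for align, and the mirrored choice Y = [(a−1)2^k, (a+3)2^k] when a is odd; the names cover and check_theorem_17 are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]


def align(x: Interval) -> Interval:
    """A largest aligned interval within x (leftmost on ties), integer endpoints."""
    s, e = x
    for k in range((e - s).bit_length() - 1, -1, -1):
        size = 1 << k
        a = -(-s // size)  # ceiling of s / size
        if (a + 1) * size <= e:
            return (a * size, (a + 1) * size)
    raise ValueError("interval must have length at least 1")


def cover(X: Interval) -> Interval:
    """The interval Y of length 4*span(X) used in the proof (mirrored for odd a)."""
    size = X[1] - X[0]
    a = X[0] // size
    if a % 2 == 0:
        return ((a - 2) * size, (a + 2) * size)
    return ((a - 1) * size, (a + 3) * size)


def check_theorem_17(limit: int = 48) -> bool:
    """Check Z within Y for every integer interval Z in [-limit, limit]
    whose alignment lies within X, for a range of small aligned X."""
    for k in range(3):
        size = 1 << k
        for a in range(-4, 5):
            X = (a * size, (a + 1) * size)
            Y = cover(X)
            for s in range(-limit, limit):
                for e in range(s + 1, limit + 1):
                    A = align((s, e))
                    if X[0] <= A[0] and A[1] <= X[1]:
                        if not (Y[0] <= s and e <= Y[1]):
                            return False
    return True


print(check_theorem_17())
```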

Theorem 18 (Alignment Reduction). Take any k ∈ Z and γ = 2^k and 4γ-underallocated instance I. Let J be I with all
windows aligned. Then J is γ-underallocated.

Proof. Since J has finitely many tasks, we can recursively allocate the tasks by structural induction on aligned intervals, the
invariance being that all tasks with smaller windows can be allocated to aligned slots. At each step, take some smallest aligned
interval W with length x such that at least one unallocated task in J has window W. Then W is either identical to or disjoint
from the window for every unallocated task. By Align’s properties (Theorem 17), all the windows in I that are aligned to
within W in J are within an interval of length 4x, and hence there are at most x/γ such windows since I is 4γ-underallocated.
Thus there are at most x/γ tasks in J with window within W, and we can allocate those of them that are currently unallocated
to aligned slots by the invariance and since γ is a power of 2. Therefore J has a γ-solution.


Remark. The condition in Alignment Reduction that y is a power of 2 cannot be omitted as the theorem is false if y = 7. A 
counterexample is an instance with 9 tasks where 8 of them have window [1,2® — 1] and one has window 2*^ + 4-2® + [—14,14], 
which can align to 2° + [0,8 • 2'^] and 2° + [3 ■ 2'^, 4 ■ 2'’] respectively. The unaligned instance is 4(7)-underallocated because 
9 • 4(7) < 2® — 2, but the aligned instance is not 7-underallocated, because 2® + [0,4- 2®] and 2® + [3 • 2®, 8 ■ 2®] can accomodate 
only 4 and 5 slots of length 7 respectively, since 5 • 7 > 4 • 2® and 6-7 > 5-2®. 


Definition 19 (Density). Take any allocator A. At any point in time, let I be the current instance and S be the set of slots in 
the current solution maintained by A. For any interval A, let density(A) = span(X) 

Theorem 20 (Reallocation Requirement). Take any allocator A and γ ∈ N⁺. Let I be the current instance and S be the
set of slots in the current solution maintained by A. Then there is some sequence of 2^(2γ−1) insertions such that I is always
γ-underallocated but A makes at least one reallocation.

Proof. If A does not perform any reallocation for any insertion sequence, we do the following. Set Wi = [0,y-2^^“®]. We 
shall now construct the insertion sequence inductively such that the following invariances hold after step k for each k from 0 
to 2y — 1: 


1. span(lLfc+i)=y22T-(fc+i). 

2. density(ILfc+i) > 

3. /is y-underallocated. 


At step k from 1 to 2y — 1, insert 2^''' ^ ^ tasks all with window Wj^. Let A = left(V14) and B = right(lTfc). Then 
by Invariances 1,2 density)!)^;) > before these insertions, and hence density(tFfc) > 


fc-l 

2y 


22y 


-k-1 


> 


after the 


- 2y 

insertions. Thus max(density(A), density(i3)) > Set VFfc+i G {A,B} such that density(kFfc_|_i) > Therefore 
Invariances 1,2 are preserved. Note that / is still y-underallocated, because all the tasks with windows Wi can be allocated to 
y-length slots within for each i G [l..fc — 1], giving Invariance 3. 


After (2y — 1) steps, we have inserted (2^'^“^ — 1) tasks and density(W 2 Y) > 1. Insert one more task with window W 2 y, 
which is possible because I is still y-underallocated by the same argument as before. Now density(kF 2 y) > which is 
impossible. 


Therefore the theorem follows. 

Remark. Note that Reallocation Requirement holds for any γ ∈ N⁺, but if γ is also a power of 2, the inserted windows will
be aligned, and hence reallocations are necessary even if the instance is always aligned.

Theorem 21 (Underallocation Requirement). Take any allocator A. Then there is some sequence of operations that insert 
only tasks with aligned windows such that A makes log(n) reallocations on every subsequent insertion after an initial segment 
of the sequence, where n is the number of tasks in the system. 
Proof. Take any k ∈ N. In the setup phase, insert one task with window X for every aligned interval X within [0,2^k] of length
at least 2. This is clearly feasible and the setup phase is complete. Let I = (n,T,W) be the current instance. Start from the
interval C = [0,2^k]. While C has length at least 2, we have C = W[i] for some i, and so update C to the half that overlaps
S[i]. Then if a new task is inserted with window within the new C, S[i] must be reallocated. This makes C trace a path of
k+1 distinct intervals. At the end C is an interval of length 1, so insert a task with window C. It is not too hard to check
by induction that I is feasible and now n = 2^k, and hence A will have to make k = log(n) reallocations. After that, delete
the task with window C, upon which there is yet again exactly one task with window X for every aligned interval X within
[0,2^k] of length at least 2. Repeating then yields the theorem.
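
The counting in this proof can be checked with a short Python sketch. The helper names setup_windows and path_length are hypothetical, and the path below always descends into the left half, whereas in the proof the half is dictated by where the allocator placed S[i]; only the counts are being illustrated.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def setup_windows(k: int) -> List[Interval]:
    """All aligned intervals within [0, 2^k] of length at least 2,
    one per task inserted in the setup phase."""
    return [(a << j, (a + 1) << j)
            for j in range(1, k + 1)
            for a in range(1 << (k - j))]


def path_length(k: int) -> int:
    """Number of intervals on a path that halves [0, 2^k] down to length 1."""
    C = (0, 1 << k)
    count = 1
    while C[1] - C[0] >= 2:
        C = (C[0], (C[0] + C[1]) // 2)  # always the left half, for illustration
        count += 1
    return count


k = 4
windows = setup_windows(k)
n = len(windows) + 1  # after inserting the final unit-window task
print(len(windows), n, path_length(k))
```

The setup phase inserts 2^k − 1 tasks, so after the final insertion n = 2^k and the path of k+1 intervals corresponds to k = log(n) forced reallocations.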

Remark. Note that Underallocation Requirement holds even if reallocations are allowed on deletions, since the same proof 
works. 

Theorem 22 (Non-Genericity). Take any γ = 2^k for some k ∈ N. Then there is some insert state (I,S,r) with aligned
γ-underallocated I that requires at least log_{2γ}(n−1) reallocations to be solved.

Proof. Take any m ∈ N. Let X_i = [0,(2γ)^i] for each i ∈ [1..m+1] and define an insert state (I,S,r) with I = (n,T,W) as follows:

n = (2γ)^m + 1.

W[i] = X_{j+1} for each i ∈ [(2γ)^(j−1)+1..(2γ)^j] for each j ∈ [1..m].
W[n] = X_1.

S[i] = i + [−1,0] for each i ∈ [1..n−1].

S[n] = null.

Then I is γ-underallocated, because it has a γ-solution G where G[i] = i(2γ) + [−γ,0] for each i ∈ [1..n−1] and G[n] =
[0,γ], since G[i].end = i(2γ) ≤ (2γ)^(j+1) = X_{j+1}.end = W[i].end for each i ∈ [(2γ)^(j−1)+1..(2γ)^j] for each j ∈ [1..m]. Also,
X_j contains (2γ)^j slots for each j ∈ [1..m] since X_j.end = (2γ)^j ≤ n−1.

Now take any sequence of reallocations that modifies S to a solution S' for I. Clearly at least one slot within X_j must
be reallocated from inside to outside X_j for each j ∈ [1..m] since there are now too many slots within X_j, and for such a
reallocated slot S[i] it must be that W[i] = X_{j+1}, because W[i] is larger than X_j but not larger than X_{j+1} since originally
i ≤ (2γ)^j. Therefore these slots that must be reallocated are distinct for distinct j, which implies that solving (I,S,r) requires
reallocating at least m = log_{2γ}(n−1) slots.

Algorithm 23 (VA).

Variables

    Ordered instance I = (n,T,W) // current aligned instance ; must be feasible before and after each operation
    Allocation S // current allocation for I ; must be a solution for I before and after each operation

Initialization

    Set I = (n,T,W) = (0,∅,∅) and S = ∅.

External Interface

    Procedure Insert( task t , window w ) // inserts task t with window w into the system
    Procedure Delete( task t ) // deletes task t from the system

Implementation

    Subroutine count( aligned interval X ):
        Return #({ i : S[i] ⊆ X }).

    Subroutine high( aligned interval X ):
        Return max({ span(W[i]) : S[i] ⊆ X }).

    Subroutine best( aligned interval X ):
        If span(X) = 1:
            Return X.
        Set A = left(X).
        Set B = right(X).
        Return best( count(A) < count(B) ? A : ( count(A) > count(B) ? B : ( high(A) > high(B) ? A : B ) ) ).

    Subroutine bad( aligned interval X ):
        If span(X) = 1:
            Return X.
        Set A = left(X).
        Set B = right(X).
        Return bad( high(A) > high(B) ? A : ( high(A) < high(B) ? B : ( count(A) > count(B) ? A : B ) ) ).

    Subroutine imbalance( aligned interval X ):
        Return ( count(X) < 1 ? null : ( count(X) > count(sibling(X)) + 1 ? X : imbalance(parent(X)) ) ).

    Procedure Insert( task t , window w ):
        // Create the insert state //
        Set w' = align(w).
        Set (I = (n,T,W),S,r) to be the insert state on insertion of (t,w') into (I,S).
        // Check if the instance is 2-underallocated //
        If I is not 2-underallocated:
            Return Failure.
        // Insert //
        Set S[r] = best(W[r]).
        // Solve overlap //
        If S[i] = S[r] for some i ≠ r:
            Set S[i] = best(W[i]).
            Return Success.
        // Correct imbalance //
        Set X = imbalance(S[r]).
        If X ≠ null:
            Set i such that S[i] = bad(X).
            Set S[i] = best(W[i]).
        Return Success.

    Procedure Delete( task t ):
        If t ∈ T:
            Delete t from (I,S).
            Return Success.
        Otherwise:
            Return Failure.
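
The descent performed by the subroutine best can be illustrated with the following Python sketch. It is an assumption-laden illustration, not VA itself: slots and windows are kept in plain dictionaries with integer endpoints, count and high scan all slots instead of using an efficient index, high returns 0 for an interval containing no slot (the pseudocode leaves the maximum of an empty set unspecified), and the strict comparison in the final tie-break follows the pseudocode as printed.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]  # (start, end) with integer endpoints


def within(a: Interval, b: Interval) -> bool:
    return b[0] <= a[0] and a[1] <= b[1]


def count(S: Dict[int, Interval], X: Interval) -> int:
    """Number of slots lying within X."""
    return sum(1 for s in S.values() if within(s, X))


def high(S: Dict[int, Interval], W: Dict[int, Interval], X: Interval) -> int:
    """Largest window span among tasks whose slot lies within X (0 if none)."""
    return max((W[i][1] - W[i][0] for i, s in S.items() if within(s, X)), default=0)


def best(S: Dict[int, Interval], W: Dict[int, Interval], X: Interval) -> Interval:
    """Descend from the aligned interval X to a unit interval, preferring the
    half with fewer slots and, on ties, the half holding the larger window."""
    while X[1] - X[0] > 1:
        mid = (X[0] + X[1]) // 2
        A, B = (X[0], mid), (mid, X[1])
        cA, cB = count(S, A), count(S, B)
        if cA < cB:
            X = A
        elif cA > cB:
            X = B
        else:
            X = A if high(S, W, A) > high(S, W, B) else B
    return X


# Hypothetical state: two tasks already occupy the left end of [0, 8].
W = {1: (0, 8), 2: (0, 2)}
S = {1: (0, 1), 2: (1, 2)}
print(best(S, W, (0, 8)))
```

Here the right half [4,8] is empty, so the descent goes right, and the remaining ties fall to the right half each time, ending at [7,8].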

Conjecture 24 (VA’s properties). On an insertion of a new task, if the new aligned instance is 2-underallocated, VA has the
following properties:

• If the insertion is feasible, it returns Success after updating S to be a solution for the new instance by making at most 1
reallocation and taking O(log(n)) time.

• If the insertion is not feasible, it returns Failure in O(log(n)) time.

Remark. If VA’s properties hold, it means that it is enough that the unaligned instance remains 8-underallocated, because by 
Alignment Reduction the aligned instance would be 2-underallocated. 




10 Multiple processors 


In this section we give a simple reduction from the multi-processor problem to the single-processor problem as well as a bound 
on the necessary slack for an efficient allocator to exist. 


10.1 Reduction to one processor 


For sufficiently large underallocation, the problem for p processors can be ‘solved’ by using any single-processor allocator A
on a transformed version of the problem where all windows have the same start time but length shortened by 1 − 1/k and the
task length is also shortened by the same amount to 1/k, where k = ⌊(p+1)/2⌋. Each time A makes a sequence of allocations to
maintain a solution S in the transformed problem, we deallocate all the corresponding tasks in the original problem, and then
allocate each one to a slot with the same start time as in S to a processor where it can actually fit. Such a processor must exist
because the number of slots in the original system that overlap the desired slot is at most 2k − 1 ≤ p.

Notice that this method can only utilize an odd number of processors, otherwise it will essentially discard one processor!
Also, any ε-slack instance with p identical windows of length 1 + ε would under this transformation become an instance with
identical windows of length ε + 1/k, which can accommodate all the p slots of length 1/k only if ε + 1/k ≥ p/k, which implies
ε ≥ (p−1)/k ≥ 2(p−1)/(p+1).


On the other hand, if p is odd and / is the original e-slack instance when using p processors, for some e > 22qiY, then the 
transformed instance J has an (e-f ■|)-solution S using p processors. Thus J has an (e-f ■|)i-solution using 1 processor, 
because we can allocate the tasks in order of their corresponding slots in S, each to the leftmost possible position within its 
corresponding slot in S, which is always possible because the depth of the arrangement of slots in S is at most p. Hence J for 
1 processor will have slack at least — 1= > Q. 

This implies that any single-processor allocator that works if the instance remains γ-underallocated would give a
multi-processor allocator that works if the instance remains (2γ+1)-underallocated.


10.2 Inefficiency for small slack 

For p processors where p > 1, it turns out that there is no efficient allocator for arbitrarily small slack. Specifically, given any 
£ < , even if the instance is always e-slack, it is impossible to avoid reallocating tasks per operation on average for 

the following operation sequence. 

Insert for each i G [—k..k] a group (indexed by i) of p tasks with windows [1 -f i -f ^ 1](1 -f e) for j G [l..p]. There is 

an essentially unique solution up to a relabeling of processors, because 2(1 + e) — < 2. Call this Position 1. Now delete 

the p tasks in group 0 and insert {p— 1) tasks with windows [i + ^ + p + ^ + 1](1 + J ^ !]■ Since 

2(1 -f e) — < 2, adjacent tasks on the same processor cannot have overlapping windows, and there is again an essentially 

unique solution. Call this Position 2. We just need to perform (p — 1) deletions and p insertions to return from Position 2 to 
Position 1. Furthermore, it is easy to check that kp reallocations are needed to get from one position to another. Alternating 
in this way between the two positions forces at least 2kp reallocations for every 2(2p— 1) operations, despite the instance 
remaining e-slack throughout. 
11 Variable task lengths


Up to now we have only considered unit-length tasks, in which case it does not matter greatly whether we have a y- 
underallocated instance prior to insertion or after the insertion. However, this distinction becomes important in the case 
of variable-length tasks, as the two examples below will demonstrate. 

For any y > 1 the following sequence of operations requires Ω(n) reallocations per operation on average with at most n tasks in 
the system at any time. Insert k tasks with length 1 and window [1, 3]ky. Then insert 2 tasks with window W1 = [0, 2]ky, the 
first with length k and the second with length (2ky − k). Then delete the last two tasks and reinsert them, but both with window 
W2 = [2, 4]ky instead. Alternating between W1 and W2 keeps the instance y-underallocated before each insertion and 
requires k reallocations per alternation, because the two long tasks fill their window completely and so push every unit task 
into the other half of [1, 3]ky. 
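
The forcing argument in this example can be traced numerically. The following is a minimal sketch under our own naming 
(`free_region` and `reallocations_per_operation` are hypothetical helpers, not part of any allocator in this paper); it 
assumes a single processor and y >= 1, and uses exact rational arithmetic. 

```python
from fractions import Fraction


def free_region(k, y, long_window):
    """Part of the unit tasks' window [1, 3]ky that remains once the two long
    tasks occupy long_window (endpoints given in units of ky)."""
    unit = k * y
    lo, hi = long_window[0] * unit, long_window[1] * unit
    # The long tasks have lengths k and 2ky - k, so they fill their window exactly.
    assert k + (2 * k * y - k) == hi - lo
    a, b = 1 * unit, 3 * unit  # window of the k unit tasks
    # The long window covers one end of [a, b]; the other end stays free.
    if lo <= a:
        return (max(a, hi), b)
    return (a, min(b, lo))


def reallocations_per_operation(k, y):
    """Average forced reallocations per operation when switching W1 <-> W2."""
    r1 = free_region(k, y, (0, 2))  # long tasks in W1 = [0, 2]ky
    r2 = free_region(k, y, (2, 4))  # long tasks in W2 = [2, 4]ky
    # Each free region must be able to hold all k unit tasks (uses y >= 1).
    assert r1[1] - r1[0] >= k and r2[1] - r2[0] >= k
    # The free regions share at most one point, so every unit task must move.
    assert r2[1] <= r1[0]
    operations = 4  # two deletions and two insertions per switch
    return Fraction(k, operations)


if __name__ == "__main__":
    print(free_region(8, Fraction(3, 2), (0, 2)))
    print(free_region(8, Fraction(3, 2), (2, 4)))
    print(reallocations_per_operation(8, Fraction(3, 2)))
```

With n = k + 2 tasks in the system, k/4 reallocations per operation is linear in n, which is the bound claimed above. 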


However, if we want the instance to be always y-underallocated even after insertion, then the above example does not work. 
But if y < 2, we can show that the following operation sequence forces Ω(n) reallocations per operation. Let 
m = ⌈2/(2 − y)⌉ ≥ 2 and r = 1/m, and let k ∈ N be such that rk ∈ N. Insert (1 − r)k tasks with length 1 and window [0, 1]ky. 
Let V be an interval such that span(V) = rky and density(V) ≥ (1 − r)/y now, which exists by the pigeonhole principle since 
rky | ky. Then insert 1 task with length rk and window V. The instance is still y-underallocated, but 
((1 − r)/y · rky + rk) − rky = (2 − y − r)rk ≥ (2 − y − (2 − y)/2)rk = ((2 − y)/2)rk > (2 − y)²k / (2(4 − y)), 
where the last two steps use (2 − y)/(4 − y) < r ≤ (2 − y)/2. Hence at least that number of tasks, a constant fraction of k for 
fixed y, must be reallocated. Repeatedly deleting and reinserting that task as above gives the claim. This leaves the case of 
y ≥ 2 unanswered. 
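
The bound in this second example can be evaluated exactly for concrete values. The sketch below is a hedged illustration: 
the function name `forced_reallocations` is our own, the restriction 1 <= y < 2 is an assumption of the sketch, and the 
pigeonhole step is read as saying that V holds at least the average number (1 − r)rk of unit tasks. 

```python
from fractions import Fraction
from math import ceil


def forced_reallocations(y: Fraction, k: int) -> Fraction:
    """Lower bound on unit tasks that must move when the long task is inserted."""
    assert 1 <= y < 2, "sketch assumes 1 <= y < 2"
    m = ceil(2 / (2 - y))  # number of equal pieces that [0, ky] is cut into
    r = Fraction(1, m)
    assert (r * k).denominator == 1, "k must be a multiple of m so that rk is an integer"
    # Pigeonhole: some piece V of span rky holds at least the average number
    # of the (1 - r)k unit tasks, namely (1 - r)rk of them.
    unit_tasks_in_v = (1 - r) * r * k
    # After the long task of length rk is placed in V, this much room is left.
    capacity_left = r * k * y - r * k
    forced = unit_tasks_in_v - capacity_left
    # Agreement with the closed form and with the final lower bound in the text.
    assert forced == (2 - y - r) * r * k
    assert forced > (2 - y) ** 2 * k / (2 * (4 - y))
    return forced


if __name__ == "__main__":
    print(forced_reallocations(Fraction(3, 2), 8))
    print(forced_reallocations(Fraction(1), 4))
```

Each deletion and reinsertion of the long task costs two operations, so a bound that is linear in k per reinsertion 
is also linear in the number of tasks per operation. 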


In both examples, making p copies of each insertion and deletion produces a sequence that forces reallocations per 

operation. 


12 Open questions 

Firstly, does VA have the properties stated in Conjecture 24? If so, VA would be essentially optimal in terms of the worst case on insertion. 
If not, is there some y and m and an allocator that takes at most m reallocations on each insertion given that the instance 
always remains y-underallocated? Also, what is the minimum such y or m for the unaligned and aligned cases? Based on the 
examples in this paper, we guess that y = ^ is minimal in the unaligned case and m = 1 is minimal in both cases. 

Secondly, if variable-length tasks are allowed, we have shown that the problem is not efficiently solvable if the lower bound on 
the underallocation is less than 2. Is there an allocator that takes O(1) reallocations on each insertion given that the instance 
always remains 2-underallocated? We again guess that such an allocator exists. 

Thirdly, the multi-processor reduction only works when the instance is more than 2 2^-slack, but there is no obvious reason 

why there should not be an efficient allocator if the instance remains just -slack. What is the optimal allocator in that 
case? 